Transacylases of the paclitaxel biosynthetic pathway

ABSTRACT

Transacylase enzymes and the use of such enzymes to produce Taxol™, related taxoids, as well as intermediates in the Taxol™ biosynthetic pathway are disclosed. Also disclosed are nucleic acid sequences encoding the transacylase enzymes.

CROSS REFERENCE TO RELATED CASES

[0001] This application is a continuation-in-part of co-pending U.S. Application No. 09/411,145, filed Sep. 30, 1999, which is incorporated herein by reference.

ACKNOWLEDGMENT OF GOVERNMENT SUPPORT

[0002] This invention was made with government support under National Cancer Institute Grant No. CA-55254. The government has certain rights in this invention.

FIELD OF THE INVENTION

[0003] The invention relates to transacylase enzymes and methods of using such enzymes to produce Taxol™ and related taxoids.

INTRODUCTION

[0004] The complex diterpenoid Taxol™ (paclitaxel) (Wani et al., J. Am. Chem. Soc. 93:2325-2327, 1971) is a potent antimitotic agent with excellent activity against a wide range of cancers, including ovarian and breast cancer (Arbuck and Blaylock, Taxol™: Science and Applications, CRC Press, Boca Raton, 397-415, 1995; Holmes et al., ACS Symposium Series 583:31-57, 1995). Taxol™ was originally isolated from the bark of the Pacific yew (Taxus brevifolia). For a number of years, Taxol™ was obtained exclusively from yew bark, but low yields of this compound from the natural source, coupled with the destructive nature of the harvest, prompted the development of new methods of Taxol™ production. Taxol™ is currently produced primarily by semisynthesis from advanced taxane metabolites (Holton et al., Taxol™: Science and Applications, CRC Press, Boca Raton, 97-121, 1995) that are present in the needles (a renewable resource) of various Taxus species. However, because of the increasing demand for this drug (both for use earlier in the course of cancer intervention and for new therapeutic applications) (Goldspiel, Pharmacotherapy 17:110S-125S, 1997), availability and cost remain important issues. Total chemical synthesis of Taxol™ is not economically feasible. Hence, biological production of the drug and its immediate precursors will remain the method of choice for the foreseeable future. Such biological production may rely upon intact Taxus plants, Taxus cell cultures (Ketchum et al., Biotechnol. Bioeng. 62:97-105, 1999), or, potentially, microbial systems (Stierle et al., J. Nat. Prod. 58:1315-1324, 1995). In all cases, improving the biological production yields of Taxol™ depends upon a detailed understanding of the biosynthetic pathway, the enzymes catalyzing the sequence of reactions, especially the rate-limiting steps, and the genes encoding these proteins. Isolation of genes encoding enzymes involved in the pathway is a particularly important goal, since overexpression of these genes in a producing organism can be expected to markedly improve yields of the drug.

[0005] The Taxol™ biosynthetic pathway is considered to involve more than 12 distinct steps (Floss and Mocek, Taxol™: Science and Applications, CRC Press, Boca Raton, 191-208, 1995; and Croteau et al., Curr. Top. Plant Physiol. 15:94-104, 1996); however, very few of the enzymatic reactions and intermediates of this complex pathway have been defined. The first committed enzyme of the Taxol™ pathway is taxadiene synthase (Koepp et al., J. Biol. Chem. 270:8686-8690, 1995), which cyclizes the common precursor geranylgeranyl diphosphate (Hefner et al., Arch. Biochem. Biophys. 360:62-74, 1998) to taxadiene (FIG. 1). The cyclized intermediate subsequently undergoes modification involving at least eight oxygenation steps, a dehydrogenation, an epoxide rearrangement to an oxetane, and several acylations (Floss and Mocek, Taxol™: Science and Applications, CRC Press, Boca Raton, 191-208, 1995; Croteau et al., Curr. Top. Plant Physiol. 15:94-104, 1996). Taxadiene synthase has been isolated from T. brevifolia and characterized (Hezari et al., Arch. Biochem. Biophys. 322:437-444, 1995), the mechanism of action defined (Lin et al., Biochemistry 35:2968-2977, 1996), and the corresponding cDNA clone isolated and expressed (Wildung and Croteau, J. Biol. Chem. 271:9201-9204, 1996).

[0006] The second specific step of Taxol™ biosynthesis is an oxygenation reaction catalyzed by taxadiene-5α-hydroxylase (FIG. 1). The enzyme, characterized as a cytochrome P450, has been demonstrated in Taxus microsome preparations to catalyze the stereospecific hydroxylation of taxa-4(5),11(12)-diene, with double bond rearrangement, to taxa-4(20),11(12)-dien-5α-ol (Hefner et al., Chem. Biol. 3:479-489, 1996).

[0007] The third specific step of Taxol™ biosynthesis appears to be the acetylation of taxa-4(20),11(12)-dien-5α-ol to taxa-4(20),11(12)-dien-5α-yl acetate by an acetyl CoA-dependent transacetylase (Walker et al., Arch. Biochem. Biophys. 364:273-279, 1999), since the resulting acetate ester is then further efficiently oxygenated to a series of advanced polyhydroxylated Taxol™ metabolites in microsomal preparations that have been optimized for cytochrome P450 reactions (FIG. 1). The enzyme has been isolated from induced yew cell cultures (Taxus canadensis and Taxus cuspidata), and the operationally soluble enzyme was partially purified by a combination of anion exchange, hydrophobic interaction, and affinity chromatography on immobilized coenzyme A resin. This acetyl transacylase has a pI and pH optimum of 4.7 and 9.0, respectively, and a molecular weight of about 50,000 as determined by gel-permeation chromatography. The enzyme shows high selectivity and high affinity for both cosubstrates, with K_m values of 4.2 μM and 5.5 μM for taxadienol and acetyl CoA, respectively. The enzyme does not acetylate the more advanced Taxol™ precursors, 10-deacetylbaccatin III or baccatin III. This acetyl transacylase is insensitive to monovalent and divalent metal ions, is only weakly inhibited by thiol-directed reagents and coenzyme A, and in general displays properties similar to those of other O-acetyl transacylases. This acetyl CoA:taxadien-5α-ol O-acetyl transacylase from Taxus (Walker et al., Arch. Biochem. Biophys. 364:273-279, 1999) appears to be substantially different in size, substrate selectivity, and kinetics from an acetyl CoA:10-hydroxytaxane O-acetyl transacylase recently isolated and described from Taxus chinensis (Menhard and Zenk, Phytochemistry 50:763-774, 1999).

[0008] Acquisition of the gene encoding the acetyl CoA:taxa-4(20),11(12)-dien-5α-ol O-acetyl transacylase that catalyzes the first acylation step of Taxol™ biosynthesis and genes encoding other acyl transfer steps would represent an important advance in efforts to increase Taxol™ yields by genetic engineering and in vitro synthesis.

SUMMARY OF THE INVENTION

[0009] The invention stems from the discovery of twelve amplicons (regions of DNA amplified by a pair of primers using the polymerase chain reaction (PCR)). These amplicons can be used to identify transacylases, for example, the transacylases shown in SEQ ID NOs: 26, 28, 45, 50, 52, 54, 56, and 58 that are encoded by the nucleic acid sequences shown in SEQ ID NOs: 25, 27, 44, 49, 51, 53, 55, and 57. These sequences are isolated from the Taxus genus, and the respective transacylases are useful for the synthetic production of Taxol™ and related taxoids, as well as intermediates within the Taxol™ biosynthetic pathway. The sequences can also be used for the creation of transgenic organisms that either produce the transacylases for subsequent in vitro use, or produce the transacylases in vivo so as to alter the level of Taxol™ and taxoid production within the transgenic organism.

[0010] Another aspect of the invention provides the nucleic acid sequences shown in SEQ ID NOs: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, and 23 and the corresponding amino acid sequences shown in SEQ ID NOs: 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, and 24, respectively, as well as fragments of the nucleic acid and the amino acid sequences. These sequences are useful for isolating the nucleic acid and amino acid sequences corresponding to full-length transacylases. These amino acid sequences and nucleic acid sequences are also useful for creating specific binding agents that recognize the corresponding transacylases.

[0011] Accordingly, another aspect of the invention provides for the identification of transacylases and fragments of transacylases that have amino acid and nucleic acid sequences that vary from the disclosed sequences. For example, the invention provides transacylase amino acid sequences that vary by one or more conservative amino acid substitutions, or that share at least 50% sequence identity with the amino acid sequences provided while maintaining transacylase activity.

[0012] The nucleic acid sequences encoding the transacylases and fragments of the transacylases can be cloned, using standard molecular biology techniques, into vectors. These vectors can then be used to transform host cells. Thus, a host cell can be modified to express either increased levels of transacylase or decreased levels of transacylase.

[0013] Another aspect of the invention provides methods for isolating nucleic acid sequences encoding full-length transacylases. The methods involve hybridizing at least ten contiguous nucleotides of any of the nucleic acid sequences shown in SEQ ID NOs: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 44, 49, 51, 53, 55, and 57 to a second nucleic acid sequence, wherein the second nucleic acid sequence encodes a transacylase. This method can be practiced in the context of, for example, Northern blots, Southern blots, and the polymerase chain reaction (PCR). Hence, the invention also provides the transacylases identified by this method.
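By way of illustration only, the requirement of at least ten contiguous nucleotides can be approximated in silico by listing every 10-nucleotide window of a probe that also occurs in a candidate sequence. The following Python sketch is a simplified proxy and not a model of hybridization itself: it looks only for exact matches on the displayed strand, and the function name and the two short sequences are hypothetical placeholders, not sequences from the sequence listing.

```python
def shared_kmers(probe: str, target: str, k: int = 10) -> set:
    """Return every length-k window of `probe` that occurs exactly in `target`."""
    probe, target = probe.upper(), target.upper()
    # Index all length-k windows of the target once for fast lookup.
    target_kmers = {target[i:i + k] for i in range(len(target) - k + 1)}
    return {
        probe[i:i + k]
        for i in range(len(probe) - k + 1)
        if probe[i:i + k] in target_kmers
    }


if __name__ == "__main__":
    # Hypothetical sequences chosen only to exercise the function.
    hits = shared_kmers("ATGGCAGGCTCAACAGAA", "CCATGGCAGGCTCATT")
    print(sorted(hits))
```

A non-empty result indicates only that the two sequences share an exact stretch of the stated length; actual hybridization also depends on stringency conditions and tolerates some mismatches.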

[0014] Yet another aspect of the invention involves methods of adding at least one acyl group to at least one taxoid. These methods can be practiced in vivo or in vitro, and can be used to add acyl groups to various intermediates in the Taxol™ biosynthetic pathway, and to add acyl groups to related taxoids that are not necessarily in a Taxol™ biosynthetic pathway.

SEQUENCE LISTINGS

[0015] The nucleic and amino acid sequences listed in the accompanying sequence listing are shown using standard letter abbreviations for nucleotide bases, and three-letter code for amino acids. Only one strand of each nucleic acid sequence is shown, but the complementary strand is understood to be included by any reference to the displayed strand.
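For illustration, the complementary strand implied by a displayed strand can be derived mechanically. The following Python sketch assumes a DNA sequence written 5' to 3' using only the letters A, C, G, and T; the function name and the example input are hypothetical.

```python
# Watson-Crick pairing for unambiguous DNA bases (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(reverse_complement("ATGGCC"))  # hypothetical input
```

Applying the function twice returns the displayed strand, which is why showing a single strand is sufficient.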

[0016] SEQ ID NO: 1 is the nucleotide sequence of Probe 1.

[0017] SEQ ID NO: 2 is the deduced amino acid sequence of Probe 1.

[0018] SEQ ID NO: 3 is the nucleotide sequence of Probe 2.

[0019] SEQ ID NO: 4 is the deduced amino acid sequence of Probe 2.

[0020] SEQ ID NO: 5 is the nucleotide sequence of Probe 3.

[0021] SEQ ID NO: 6 is the deduced amino acid sequence of Probe 3.

[0022] SEQ ID NO: 7 is the nucleotide sequence of Probe 4.

[0023] SEQ ID NO: 8 is the deduced amino acid sequence of Probe 4.

[0024] SEQ ID NO: 9 is the nucleotide sequence of Probe 5.

[0025] SEQ ID NO: 10 is the deduced amino acid sequence of Probe 5.

[0026] SEQ ID NO: 11 is the nucleotide sequence of Probe 6.

[0027] SEQ ID NO: 12 is the deduced amino acid sequence of Probe 6.

[0028] SEQ ID NO: 13 is the nucleotide sequence of Probe 7.

[0029] SEQ ID NO: 14 is the deduced amino acid sequence of Probe 7.

[0030] SEQ ID NO: 15 is the nucleotide sequence of Probe 8.

[0031] SEQ ID NO: 16 is the deduced amino acid sequence of Probe 8.

[0032] SEQ ID NO: 17 is the nucleotide sequence of Probe 9.

[0033] SEQ ID NO: 18 is the deduced amino acid sequence of Probe 9.

[0034] SEQ ID NO: 19 is the nucleotide sequence of Probe 10.

[0035] SEQ ID NO: 20 is the deduced amino acid sequence of Probe 10.

[0036] SEQ ID NO: 21 is the nucleotide sequence of Probe 11.

[0037] SEQ ID NO: 22 is the deduced amino acid sequence of Probe 11.

[0038] SEQ ID NO: 23 is the nucleotide sequence of Probe 12.

[0039] SEQ ID NO: 24 is the deduced amino acid sequence of Probe 12.

[0040] SEQ ID NO: 25 is the nucleotide sequence of the full-length acyltransacylase clone TAX2.

[0041] SEQ ID NO: 26 is the deduced amino acid sequence of the full-length acyltransacylase clone TAX2.

[0042] SEQ ID NO: 27 is the nucleotide sequence of the full-length acyltransacylase clone TAX1.

[0043] SEQ ID NO: 28 is the deduced amino acid sequence of the full-length acyltransacylase clone TAX1.

[0044] SEQ ID NO: 29 is the amino acid sequence of a transacylase peptide fragment.

[0045] SEQ ID NO: 30 is the amino acid sequence of a transacylase peptide fragment.

[0046] SEQ ID NO: 31 is the amino acid sequence of a transacylase peptide fragment.

[0047] SEQ ID NO: 32 is the amino acid sequence of a transacylase peptide fragment.

[0048] SEQ ID NO: 33 is the amino acid sequence of a transacylase peptide fragment.

[0049] SEQ ID NO: 34 is the AT-FOR1 PCR primer.

[0050] SEQ ID NO: 35 is the AT-FOR2 PCR primer.

[0051] SEQ ID NO: 36 is the AT-FOR3 PCR primer.

[0052] SEQ ID NO: 37 is the AT-FOR4 PCR primer.

[0053] SEQ ID NO: 38 is the AT-REV1 PCR primer.

[0054] SEQ ID NO: 39 is an amino acid sequence variant that allowed for the design of the AT-FOR3 PCR primer.

[0055] SEQ ID NO: 40 is an amino acid sequence variant that allowed for the design of the AT-FOR4 PCR primer.

[0056] SEQ ID NO: 41 is a consensus amino acid sequence that allowed for the design of the AT-REV1 PCR primer.

[0057] SEQ ID NO: 42 is a PCR primer, useful for identifying transacylases.

[0058] SEQ ID NO: 43 is a PCR primer, useful for identifying transacylases.

[0059] SEQ ID NO: 44 is the nucleotide sequence of the full-length acyltransacylase clone TAX6.

[0060] SEQ ID NO: 45 is the deduced amino acid sequence of the full-length acyltransacylase clone TAX6.

[0061] SEQ ID NO: 46 is a PCR primer, useful for identifying TAX6.

[0062] SEQ ID NO: 47 is a PCR primer, useful for identifying TAX6.

[0063] SEQ ID NO: 48 is a 6-amino acid motif commonly found in transacylases.

[0064] SEQ ID NO: 49 is the nucleotide sequence of the full-length acyltransacylase clone TAX5.

[0065] SEQ ID NO: 50 is the deduced amino acid sequence of the full-length acyltransacylase clone TAX5.

[0066] SEQ ID NO: 51 is the nucleotide sequence of the full-length acyltransacylase clone TAX7.

[0067] SEQ ID NO: 52 is the deduced amino acid sequence of the full-length acyltransacylase clone TAX7.

[0068] SEQ ID NO: 53 is the nucleotide sequence of the full-length acyltransacylase clone TAX10.

[0069] SEQ ID NO: 54 is the deduced amino acid sequence of the full-length acyltransacylase clone TAX10.

[0070] SEQ ID NO: 55 is the nucleotide sequence of the full-length acyltransacylase clone TAX12.

[0071] SEQ ID NO: 56 is the deduced amino acid sequence of the full-length acyltransacylase clone TAX12.

[0072] SEQ ID NO: 57 is the nucleotide sequence of the full-length acyltransacylase clone TAX13.

[0073] SEQ ID NO: 58 is the deduced amino acid sequence of the full-length acyltransacylase clone TAX13.

FIGURES

[0074] FIG. 1: Enzymatic reactions of the Taxol™ pathway indicating cyclization of geranylgeranyl diphosphate to taxa-4(5),11(12)-diene, followed by hydroxylation/rearrangement and acetylation to taxa-4(20),11(12)-dien-5α-yl acetate. The acetate is further converted to 10-deacetylbaccatin III, baccatin III, and Taxol™. In the figure, “a” denotes taxadiene synthase; “b” denotes taxadiene-5α-hydroxylase; “c” denotes taxadien-5α-ol acetyl transacylase; and “d” denotes several subsequent steps.

[0075] FIG. 2: Peptide sequences generated by endoLysC and trypsin proteolysis of purified taxadienol acetyl transacylase.

[0076] FIG. 3: Panel A is an elution profile of the acetyl transacylase on Source HR 15Q (10×100 mm) preparative scale anion-exchange chromatography; Panel B is an elution profile on analytical scale Source HR 15Q (5×50 mm) column chromatography; and Panel C is an elution profile on the ceramic hydroxyapatite column. The solid line is the UV absorbance at 280 nm; the dotted line is the relative transacetylase activity (dpm); and the hatched line is the elution gradient (sodium chloride or sodium phosphate). Panel D is a photograph of a silver-stained 12% SDS-PAGE showing the purity of taxadien-5α-ol acetyl transacylase (50 kDa) after hydroxyapatite chromatography. A minor contaminant is present at ~35 kDa.

[0077] FIG. 4 shows four forward (AT-FOR1, AT-FOR2, AT-FOR3, AT-FOR4) and one reverse (AT-REV1) degenerate primers that were used to amplify an induced Taxus cell library cDNA from which twelve hybridization probes were obtained. Inosine positions are indicated by “I”. Each of the forward primers was paired with the reverse primer in separate PCR reactions. Primers AT-FOR1 (SEQ ID NO: 34) and AT-FOR2 (SEQ ID NO: 35) were designed from the tryptic fragment SEQ ID NO: 30; the remaining primers were derived by database searching based on SEQ ID NO: 30.

[0078] FIG. 5 shows data obtained from a coupled gas chromatographic-mass spectrometric (GC-MS) analysis of the biosynthetic taxadien-5α-yl acetate formed during the incubation of taxadien-5α-ol with soluble enzyme extracts from isopropyl β-D-thiogalactoside (IPTG)-induced E. coli JM109 cells transformed with full-length acyltransacylase clones TAX1 and TAX2. Panels A and B show the respective GC and MS profiles of authentic taxadien-5α-ol; panels C and D show the respective GC and MS profiles of authentic taxadien-5α-yl acetate; panel E shows the GC profile of taxadien-5α-ol (11.16 minutes), taxadien-5α-yl acetate (11.82 minutes), dehydrated taxadien-5α-ol (“TOH-H₂O” peak), and a contaminant, bis-(2-ethylhexyl)phthalate (“BEHP” peak, a plasticizer, CAS 117-81-7, extracted from buffer) after incubation of taxadien-5α-ol and acetyl coenzyme A with the soluble enzyme fraction derived from E. coli JM109 transformed with the full-length clone TAX1. Panel F shows the mass spectrum of taxadien-5α-yl acetate formed biosynthetically by the recombinant enzyme (11.82 minute peak in GC profile Panel E); panel G shows the GC profile of the products generated from taxadien-5α-ol and acetyl coenzyme A by incubation with the soluble enzyme fraction derived from E. coli JM109 cells transformed with the full-length clone TAX2 (note the absence of taxadien-5α-yl acetate, indicating that this clone is inactive in the transacylase reaction).

[0079] FIG. 6: Pileup of deduced amino acid sequences listed in Table 1, and of TAX1 and TAX2. Residues boxed in black (and gray) indicate the few regions of conservation. Forward arrow (left to right) shows conserved region from which degenerate forward PCR primers were designed. Reverse arrow (right to left) shows region from which the reverse PCR primer was designed (cf. FIG. 4).

[0080] FIG. 7: Dendrogram showing deduced peptide sequence relationships between Taxus transacylase sequences (Probes 1-12, TAX1, and TAX2) and closest relative sequences of defined and unknown function obtained from the GenBank database described in Table 1.

[0081] FIG. 8: Panel A shows the outline of the Taxol™ biosynthetic pathway. The cyclization of geranylgeranyl diphosphate to taxadiene by taxadiene synthase, and the hydroxylation to taxadien-5α-ol by taxadiene 5α-hydroxylase (a), the acetylation of taxadien-5α-ol by taxa-4(20),11(12)-dien-5α-ol-O-acetyl transferase (b), the conversion of 10-deacetylbaccatin III to baccatin III by 10-deacetylbaccatin III-10-O-acetyl transferase (c), and the side chain attachment to baccatin III to form Taxol™ (d) are highlighted. The broken arrow indicates several as yet undefined steps. Panel B shows a postulated biosynthetic scheme for the formation of the oxetane, present in Taxol™ and related late-stage taxoids, in which the 4(20)-ene-5α-ol is converted to the 4(20)-ene-5α-yl acetate followed by epoxidation to the 4(20)-epoxy-5α-acetoxy group and then intramolecular rearrangement to the 4-acetoxy oxetane moiety.

[0082] FIG. 9: Radio-HPLC (high-performance liquid chromatography) analysis of the biosynthetic product (Rt=7.0±0.1 minutes) generated from 10-deacetylbaccatin III and [2-³H]acetyl CoA by the recombinant acetyl transferase. The top trace shows the UV profile and the bottom trace shows the coincident radioactivity profile, both of which coincide with the retention time of authentic baccatin III. For the enzyme preparation, E. coli cells transformed with the pCWori+ vector harboring the putative DBAT gene were grown overnight at 37° C. in 5 mL Luria-Bertani medium supplemented with ampicillin, and 1 mL of this inoculum was added to and grown in 100 mL Terrific Broth culture medium (6 g bacto-tryptone, Difco Laboratories, Spark, Md., 12 g yeast extract, EM Science, Cherryhill, N.J., and 2 mL glycerol in 500 mL water) supplemented with 1 mM IPTG, 1 mM thiamine HCl and 50 μg ampicillin/mL. After 24 hours, the bacteria were harvested by centrifugation, resuspended in 20 mL of assay buffer (25 mM Mopso, pH 7.4) and then disrupted by sonication at 0-4° C. The resulting homogenate was centrifuged at 15,000 g to remove debris, and a 1 mL aliquot of the supernatant was incubated with 10-deacetylbaccatin III (400 μM) and [2-³H]acetyl coenzyme A (0.45 μCi, 400 μM) for 1 hour at 31° C. The reaction mixture was then extracted with ether and the solvent concentrated in vacuo. The crude product (pooled from five such assays) was purified by silica gel thin-layer chromatography (TLC; 70:30 ethyl acetate:hexane). The band co-migrating with authentic baccatin III (Rf−0.45 for the standard) was isolated and analyzed by radio-HPLC to reveal the new radioactive product described above. Extracts of E. coli transformed with empty vector controls did not yield detectable product when assayed by identical methods.

[0083] FIG. 10: Combined reverse-phase HPLC-chemical ionization MS (mass spectrometry) analysis of (spectrum A) the biosynthetic product (Rt=8.6±0.1 minutes) generated by recombinant acetyl transferase with 10-deacetylbaccatin III and acetyl CoA as co-substrates, and of (spectrum B) authentic baccatin III (Rt=8.6±0.1 minutes). The diagnostic mass spectral fragments are at m/z 605 (M+NH₄⁺), 587 (MH⁺), 572 (MH⁺−CH₃), 527 (MH⁺−CH₃COOH), and 509 (MH⁺−(CH₃COOH+H₂O)). For preparation of recombinant enzyme and product isolation, see FIG. 9 legend.
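As an aside on the degenerate primers referred to in the legend of FIG. 4, the number of distinct oligonucleotides represented by a degenerate primer is the product of the number of bases allowed at each position. The Python sketch below uses the standard IUPAC ambiguity codes; treating inosine (“I”) as a single synthesized base that contributes a factor of one is an assumption of this sketch, and the example primer is hypothetical, not one of SEQ ID NOs: 34-38.

```python
# Number of bases represented by each IUPAC nucleotide code.
IUPAC_FOLD = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
    "I": 1,  # assumption: inosine is one synthesized base, so no added degeneracy
}


def degeneracy(primer: str) -> int:
    """Return how many distinct sequences a degenerate primer represents."""
    total = 1
    for base in primer.upper():
        if base not in IUPAC_FOLD:
            raise ValueError(f"unrecognized base code: {base!r}")
        total *= IUPAC_FOLD[base]
    return total


if __name__ == "__main__":
    print(degeneracy("ATGYTNCARI"))  # hypothetical primer
```

Substituting inosine at fully ambiguous positions is one common way to keep this number low, which is consistent with the use of inosine in the primers of FIG. 4.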

DETAILED DESCRIPTION

[0084] Definitions

[0085] Mammal: This term includes both humans and non-human mammals. Similarly, the term “patient” includes both humans and veterinary subjects.

[0086] Taxoid: A “taxoid” is a chemical based on the taxane ring structure as described in Kingston et al., Progress in the Chemistry of Organic Natural Products, Springer-Verlag, 1993.

[0087] Isolated: An “isolated” biological component (such as a nucleic acid or protein or organelle) is a component that has been substantially separated or purified away from other biological components in the cell of the organism in which the component naturally occurs, i.e., other chromosomal and extra-chromosomal DNA, RNA, proteins, and organelles. Nucleic acids and proteins that have been “isolated” include nucleic acids and proteins purified by standard purification methods. The term also embraces nucleic acids and proteins prepared by recombinant expression in a host cell, as well as chemically synthesized nucleic acids.

[0088] Orthologs: An “ortholog” is a gene that encodes a protein whose function is similar to that of a protein encoded by a gene derived from a different species.

[0089] Homologs: “Homologs” are two nucleotide sequences that share a common ancestral sequence and diverged when a species carrying that ancestral sequence split into two species.

[0090] Purified: The term “purified” does not require absolute purity; rather, it is intended as a relative term. Thus, for example, a purified enzyme or nucleic acid preparation is one in which the subject protein or nucleotide, respectively, is at a higher concentration than the protein or nucleotide would be in its natural environment within an organism. For example, a preparation of an enzyme can be considered as purified if the enzyme content in the preparation represents at least 50% of the total protein content of the preparation.

[0091] Vector: A “vector” is a nucleic acid molecule as introduced into a host cell, thereby producing a transformed host cell. A vector may include nucleic acid sequences, such as an origin of replication, that permit the vector to replicate in a host cell. A vector may also include one or more screenable markers, selectable markers, or reporter genes and other genetic elements known in the art.

[0092] Transformed: A “transformed” cell is a cell into which a nucleic acid molecule has been introduced by molecular biology techniques. As used herein, the term “transformation” encompasses all techniques by which a nucleic acid molecule might be introduced into such a cell, including transfection with a viral vector, transformation with a plasmid vector, and introduction of naked DNA by electroporation, lipofection, and particle gun acceleration.

[0093] DNA construct: The term “DNA construct” is intended to indicate any nucleic acid molecule of cDNA, genomic DNA, synthetic DNA, or RNA origin. The term “construct” is intended to indicate a nucleic acid segment that may be single- or double-stranded, and that may be based on a complete or partial naturally occurring nucleotide sequence encoding one or more of the transacylase genes of the present invention. It is understood that such nucleotide sequences include intentionally manipulated nucleotide sequences, e.g., subjected to site-directed mutagenesis, and sequences that are degenerate as a result of the genetic code. All degenerate nucleotide sequences are included within the scope of the invention so long as the transacylase encoded by the nucleotide sequence maintains transacylase activity as described below.

[0094] Recombinant: A “recombinant” nucleic acid is one having a sequence that is not naturally occurring in the organism in which it is expressed, or has a sequence made by an artificial combination of two otherwise-separated, shorter sequences. This artificial combination is often accomplished by chemical synthesis or, more commonly, by the artificial manipulation of isolated segments of nucleic acids, e.g., by genetic engineering techniques. “Recombinant” is also used to describe nucleic acid molecules that have been artificially manipulated, but contain the same control sequences and coding regions that are found in the organism from which the gene was isolated.

[0095] Specific binding agent: A “specific binding agent” is an agent that is capable of specifically binding to the transacylases of the present invention, and may include polyclonal antibodies, monoclonal antibodies (including humanized monoclonal antibodies) and fragments of monoclonal antibodies such as Fab, F(ab′)2 and Fv fragments, as well as any other agent capable of specifically binding to the epitopes on the proteins.

[0096] cDNA (complementary DNA): A “cDNA” is a piece of DNA lacking internal, non-coding segments (introns) and regulatory sequences that determine transcription. cDNA is synthesized in the laboratory by reverse transcription from messenger RNA extracted from cells.

[0097] ORF (open reading frame): An “ORF” is a series of nucleotide triplets (codons) coding for amino acids without any termination codons. These sequences are usually translatable into respective polypeptides.
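As an illustration of this definition, the following Python sketch scans one reading frame of a DNA sequence and returns the longest uninterrupted run of codons that contains no termination codon. It assumes the standard genetic code (termination codons TAA, TAG, and TGA), does not require a start codon because the definition above does not, and uses a hypothetical function name and test sequence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code


def longest_orf(seq: str, frame: int = 0) -> str:
    """Longest run of complete codons in `frame` (0, 1, or 2) with no stop codon."""
    seq = seq.upper()
    best, current = [], []
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon in STOP_CODONS:
            current = []  # a stop codon ends the current run
        else:
            current.append(codon)
            if len(current) > len(best):
                best = list(current)
    return "".join(best)


if __name__ == "__main__":
    # Hypothetical sequence containing two stop codons in frame 0.
    print(longest_orf("ATGGCATAAGGGCCCAAATTTTGA"))
```

Scanning all three frames of both strands would require calling the function six times, once per frame on the displayed strand and once per frame on its reverse complement.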

[0098] Operably linked: A first nucleic acid sequence is “operably linked” with a second nucleic acid sequence whenever the first nucleic acid sequence is placed in a functional relationship with the second nucleic acid sequence. For instance, a promoter is operably linked to a coding sequence if the promoter affects the transcription or expression of the coding sequence. Generally, operably linked DNA sequences are contiguous and, where necessary to join two protein-coding regions, in the same reading frame.

[0099] Probes and primers: Nucleic acid probes and primers may be prepared readily based on the amino acid sequences and nucleic acid sequences provided by this invention. A “probe” comprises an isolated nucleic acid attached to a detectable label or reporter molecule. Typical labels include radioactive isotopes, ligands, chemiluminescent agents, and enzymes. Methods for labeling and guidance in the choice of labels appropriate for various purposes are discussed in, e.g., Sambrook et al. (ed.), Molecular Cloning: A Laboratory Manual, 2nd ed., vol. 1-3, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989, and Ausubel et al. (ed.), Current Protocols in Molecular Biology, Greene Publishing and Wiley-Interscience, New York (with periodic updates), 1987.

[0100] “Primers” are short nucleic acids, preferably DNA oligonucleotides 10 nucleotides or more in length. A primer may be annealed to a complementary target DNA strand by nucleic acid hybridization to form a hybrid between the primer and the target DNA strand, and then extended along the target DNA strand by a DNA polymerase enzyme. Primer pairs can be used for amplification of a nucleic acid sequence, e.g., by the polymerase chain reaction (PCR), or other nucleic-acid amplification methods known in the art.

[0101] Methods for preparing and using probes and primers are described, for example, in references such as Sambrook et al. (ed.), Molecular Cloning: A Laboratory Manual, 2nd ed., vol. 1-3, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989; Ausubel et al. (ed.), Current Protocols in Molecular Biology, Greene Publishing and Wiley-Interscience, New York (with periodic updates), 1987; and Innis et al., PCR Protocols: A Guide to Methods and Applications, Academic Press: San Diego, 1990. PCR primer pairs can be derived from a known sequence, for example, by using computer programs intended for that purpose such as Primer (Version 0.5, © 1991, Whitehead Institute for Biomedical Research, Cambridge, Mass.). One of skill in the art will appreciate that the specificity of a particular probe or primer increases with the length of the probe or primer. Thus, for example, a primer comprising 20 consecutive nucleotides will anneal to a target with a higher specificity than a corresponding primer of only 15 nucleotides. Thus, in order to obtain greater specificity, probes and primers may be selected that comprise, for example, 10, 20, 25, 30, 35, 40, 50 or more consecutive nucleotides.
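The relationship between primer length and specificity can be illustrated with a rough calculation. The Python sketch below estimates how many exact matches to a primer would be expected purely by chance on one strand of a target, under the simplifying assumption (made only for this sketch) that the target is a random sequence in which the four bases are equally frequent. The function name and the target length are hypothetical.

```python
def expected_chance_sites(primer_length: int, target_length: int) -> float:
    """Expected number of exact chance matches on one strand of a random target.

    Assumes independent positions with each of the four bases at frequency 0.25.
    """
    positions = max(target_length - primer_length + 1, 0)
    return positions * 0.25 ** primer_length


if __name__ == "__main__":
    target = 10 ** 9  # hypothetical target size in nucleotides
    for n in (10, 15, 20):
        print(n, expected_chance_sites(n, target))
```

Under these assumptions each added nucleotide reduces the expected number of chance sites roughly fourfold, which is the sense in which a 20-nucleotide primer is more specific than a 15-nucleotide primer. Real genomes are not random, so the figures are indicative only.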

[0102] Sequence identity: The similarity between two nucleic acid sequences or between two amino acid sequences is expressed in terms of the level of sequence identity shared between the sequences. Sequence identity is typically expressed in terms of percentage identity; the higher the percentage, the more similar the two sequences.
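For illustration, percentage identity can be computed from two sequences that have already been aligned. The Python sketch below counts identical, non-gap columns and divides by the number of alignment columns. The choice of denominator is an assumption of this sketch (conventions differ between programs), and the function name, the gap character “-”, and the example peptides are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of alignment columns in which both sequences have the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns in which both sequences carry a gap.
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == gap and b == gap)
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Hypothetical pre-aligned peptides.
    print(round(percent_identity("MKT-LV", "MRTALV"), 1))
```

The sketch does not perform the alignment itself; it only scores an alignment produced by another program.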

[0103] Methods for aligning sequences for comparison are well known in the art. Various programs and alignment algorithms are described in: Smith & Waterman, Adv. Appl. Math. 2:482, 1981; Needleman & Wunsch, J. Mol. Biol. 48:443, 1970; Pearson & Lipman, Proc. Natl. Acad. Sci. USA 85:2444, 1988; Higgins & Sharp, Gene 73:237-244, 1988; Higgins & Sharp, CABIOS 5:151-153, 1989; Corpet et al., Nucleic Acids Research 16:10881-10890, 1988; Huang et al., CABIOS 8:155-165, 1992; and Pearson et al., Methods in Molecular Biology 24:307-331, 1994. Altschul et al., J. Mol. Biol. 215:403-410, 1990, presents a detailed consideration of sequence alignment methods and homology calculations.

[0104] The National Center for Biotechnology Information (NCBI) Basic Local Alignment Search Tool (BLAST™, Altschul et al., J. Mol. Biol. 215:403-410, 1990) is available from several sources, including the National Center for Biotechnology Information (NCBI, Bethesda, Md.) and on the Internet, for use in connection with the sequence-analysis programs blastp, blastn, blastx, tblastn and tblastx. BLAST™ can be accessed on the Internet at http://www.ncbi.nlm.nih.gov/BLAST/. A description of how to determine sequence identity using this program is available on the Internet at http://www.ncbi.nlm.nih.gov/BLAST/blast_help.html.

[0105] For comparisons of amino acid sequences of greater than about 30 amino acids, the “Blast 2 sequences” function of the BLAST™ program is employed using the default BLOSUM62 matrix set to default parameters (gap existence cost of 11, and a per residue gap cost of 1). When aligning short peptides (fewer than around 30 amino acids), the alignment should be performed using the Blast 2 sequences function, employing the PAM30 matrix set to default parameters (open gap 9, extension gap 1 penalties). Proteins with even greater similarity to the reference sequences will show increasing percentage identities when assessed by this method, such as at least 45%, at least 50%, at least 60%, at least 80%, at least 85%, at least 90%, or at least 95% sequence identity.
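The choice of scoring matrix described above can be summarized as a simple rule. The Python sketch below returns the matrix and gap costs stated in the preceding paragraph for a pair of peptide lengths. It does not invoke BLAST™; the function name and dictionary keys are hypothetical, and treating a length of exactly 30 residues as "long" is an assumption of this sketch, since the text gives the cut-off only approximately.

```python
SHORT_PEPTIDE_CUTOFF = 30  # assumption: the text says "about 30" amino acids


def blast2_protein_settings(length_a: int, length_b: int) -> dict:
    """Return the matrix and gap costs named in the text for two peptide lengths."""
    if min(length_a, length_b) < SHORT_PEPTIDE_CUTOFF:
        # Short peptides: PAM30 with open gap 9, extension gap 1.
        return {"matrix": "PAM30", "gap_open": 9, "gap_extend": 1}
    # Longer sequences: BLOSUM62 with gap existence 11, per-residue gap cost 1.
    return {"matrix": "BLOSUM62", "gap_open": 11, "gap_extend": 1}


if __name__ == "__main__":
    print(blast2_protein_settings(440, 439))
    print(blast2_protein_settings(12, 440))
```

Basing the rule on the shorter of the two sequences is a design choice of the sketch; the text does not say which sequence governs when only one of them is short.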

[0106] Transacylase (an older name for acyltransferase) activity: Enzymes exhibiting transacylase activity are capable of transferring acyl groups, forming either esters or amides, by catalyzing reactions in which an acyl group that is linked to a carrier (acyl-carrier) is transferred to a reactant, thus forming an acyl group linked to the reactant (acyl-reactant).

[0107] Transacylases: Transacylases are enzymes that display transacylase activity as described supra. However, not all transacylases recognize the same carriers and reactants. Therefore, transacylase enzyme-activity assays must utilize different substrates and reactants depending on the specificity of the particular transacylase enzyme. One of ordinary skill in the art will appreciate that the assay described below is a representative example of a transacylase activity assay, and that similar assays can be used to test transacylase activity directed towards different substrates and reactants.

[0108] Substantial similarity: A first nucleic acid is “substantially similar” to a second nucleic acid if, when optimally aligned (with appropriate nucleotide deletions or gap insertions) with the other nucleic acid (or its complementary strand), there is nucleotide sequence identity in at least about, for example, 50%, 75%, 80%, 85%, 90% or 95% of the nucleotide bases. Sequence similarity can be determined by comparing the nucleotide sequences of two nucleic acids using the BLAST™ sequence analysis software (blastn) available from The National Center for Biotechnology Information. Such comparisons may be made using the software set to default settings (expect=10, filter=default, descriptions=500 pairwise, alignments=500, alignment view=standard, gap existence cost=11, per residue existence=1, per residue gap cost=0.85). Similarly, a first polypeptide is substantially similar to a second polypeptide if they show sequence identity of at least about 75%-90% or greater when optimally aligned and compared using BLAST software (blastp) using default settings.

[0109] II. Characterization of acetyl CoA:taxa-4(20),11(12)-dien-5α-ol O-acetyl transacylase

[0110] A. Enzyme Purification and Library Construction

[0111] Biochemical studies have indicated that the third specific intermediate of the Taxol™ biosynthesis pathway is taxa-4(20),11(12)-dien-5α-yl acetate, because this metabolite serves as a precursor of a series of polyhydroxy taxanes en route to the end-product (Hezari and Croteau, Planta Medica 63:291-295, 1997). The responsible enzyme, taxadienol acetyl transacylase, which converts taxadienol to the C5-acetate ester, is thus an important candidate for cDNA isolation for the purpose of overexpression in relevant producing organisms to increase Taxol™ yield (Walker et al., Arch. Biochem. Biophys. 364:273-279, 1999).

[0112] This enzyme has been partially purified and characterized with respect to reaction parameters (Walker et al., Arch. Biochem. Biophys. 364:273-279, 1999); however, the published fractionation protocol does not yield a pure protein suitable for the amino acid microsequencing that is required for an attempt at reverse genetic cloning of the gene. [It is also important to note that the gene has no homologs or orthologs (i.e., other terpenoid or isoprenoid O-acetyl transacylases) in the databases to permit similarity-based cloning approaches.]

[0113] Using methyl jasmonate-induced Taxus canadensis cells as an enriched enzyme source, a new isolation and purification protocol (see FIG. 3, and protocol described infra) was developed to efficiently yield homogeneous protein for microsequencing. Although the protein was N-blocked and failed to yield peptides that could be internally sequenced by V8 (endoproteinase Glu-C, Roche Molecular Biochemical, Nutley, N.J.) proteolysis or cyanogen bromide (CNBr) cleavage, treatment with endolysC (endoproteinase Lys-C, Roche Molecular Biochemical, Nutley, N.J.) and trypsin yielded a mixture of peptides. Five of these could be separated by high-performance liquid chromatography (HPLC) and verified by mass spectrometry (MS), and yielded sequence information useful for a cloning effort (FIG. 2).

[0114] For cDNA library construction, a stable, methyl jasmonate-inducible T. cuspidata suspension cell line was chosen for mRNA isolation because the production of Taxol™ was highly inducible in this system (which permits the preparation of a suitable subtractive library, if necessary). The mixing of experimental protocols as used with different Taxus species is not a significant limitation, since all Taxus species are known to be very closely related and are considered by several taxonomists to represent geographic variants of the basic species T. baccata (Bolsinger and Jaramillo, Silvics of Forest Trees of North America (revised), Pacific Northwest Research Station, USDA, p. 17, Portland, Oreg., 1990; and Voliotis, Isr. J. Botany 35:47-52, 1986). Thus, the genes encoding geranylgeranyl diphosphate synthase and taxadiene synthase (early steps of Taxol™ biosynthesis) from T. canadensis and T. cuspidata evidence only very minor sequence differences. Hence, a method was developed for the isolation of high-quality mRNA from Taxus cells (Qiagen, Valencia, Calif.), and this material was employed for cDNA library construction using a commercial kit available from Stratagene, La Jolla, Calif.

[0115] B. Reverse Genetic Cloning

[0116] Of the five tryptic peptides that were sequenced (FIG. 2), the peptides of SEQ ID NOs: 30, 31, and 33 were found to exhibit some similarity to the sequences of the only two other plant acetyl transacylases that have been documented, namely, deacetylvindoline O-acetyl transacylase involved in indole alkaloid biosynthesis (St. Pierre et al., Plant J. 14:703-713, 1998) and benzyl alcohol O-acetyl transacylase involved in the biosynthesis of aromatic esters of floral scent (Dudareva et al., Plant J. 14:297-304, 1998). Lesser resemblance was found to a putative aromatic O-benzoyl transacylase of plant origin (Yang et al., Plant Mol. Biol. 35:777-789, 1997). Of the five peptide sequences (FIG. 2), SEQ ID NO: 30 was most suitable for primer design based on codon degeneracy considerations, and two such forward degenerate primers, AT-FOR1 (SEQ ID NO: 34) and AT-FOR2 (SEQ ID NO: 35), were synthesized (FIG. 4). A search of the database with the tryptic peptide ILVYYPPFAGR (SEQ ID NO: 30) revealed two possible variants of this sequence among several gene entries of known and unknown function (these entries are listed in Table 1). Consideration of these distantly related sequences allowed the design of two additional forward degenerate primers (AT-FOR3 (SEQ ID NO: 36) and AT-FOR4 (SEQ ID NO: 37)), and permitted identification of a distal consensus sequence from which a degenerate reverse primer (AT-REV1 (SEQ ID NO: 38)) was designed (FIG. 4). (An alignment of the Taxus sequences with the extant database sequence entries of Table 1 illustrates the lack of significant homology between the Taxus sequences and any previously described genes.)

TABLE 1. Database (GenBank) sequences used for peptide comparisons. For alignment, see FIG. 6; for placement in dendrogram, see FIG. 7. The accession number is followed by a two-letter code indicating genus and species (AT, Arabidopsis thaliana; CM, Cucumis melo; CR, Catharanthus roseus; DC, Dianthus caryophyllus; CB, Clarkia breweri; NT, Nicotiana tabacum).

Accession No. | Protein Identification No. | Function
AC000103_AT | g2213627 | unknown; from genomic sequence for Arabidopsis thaliana BAC F21J9
AC000103_AT | g2213628 | unknown; from genomic sequence for A. thaliana BAC F21J9
AF002109_AT | g2088651 | unknown; hypersensitivity-related gene 201 isolog
AC002560_AT | g2809263 | unknown; from genomic sequence for A. thaliana BAC F21B7
AC002986_AT | g3152598 | unknown; similarity to C2-HC type zinc finger protein C.e-My/T1 gb/U67079 from C. elegans and to hypersensitivity-related gene 201 isolog T28M21.14 from A. thaliana BAC
AC002392_AT | g3176709 | putative anthranilate N-hydroxycinnamoyl/benzoyltransferase
AL031369_AT | g3482975 | unknown; putative protein
Z84383_AT | g2239083 | hydroxycinnamoyl:benzoyl-CoA: anthranilate N-hydroxycinnamoyl: benzoyl transferase
Z97338_AT | g2244896 | unknown; similar to HSR201 protein N. tabacum
Z97338_AT | g2244897 | unknown; hypothetical protein
AL049607_AT | g4584530 | unknown; putative protein
AF043464_CB | g3170250 | acetyl CoA:benzylalcohol acetyl transferase
Z70521_CM | g1843440 | unknown; expressed during ripening of melon (Cucumis melo L.) fruits
AF053307_CR | g4091808 | deacetylvindoline 4-O-acetyl transferase
AC004512_DC | g3335350 | unknown; similar to gb/Z84386 anthranilate N-hydroxycinnamoyl/benzoyltransferase from Dianthus caryophyllus
X95343_NT | g1171577 | unknown; hypersensitive reaction in tobacco

[0117] PCR amplifications were performed using each combination of forward and reverse primers, with induced Taxus cell library cDNA as the target. Cloning and sequencing of the amplification products revealed twelve related but distinct amplicons (each ca. 900 bp) having origins from the various primers (Table 2). These amplicons are designated “Probe 1” through “Probe 12,” and their nucleotide and deduced amino acid sequences are listed as SEQ ID NOs: 1-24, respectively.

TABLE 2. Primer combinations, amplicons and acquired genes. The parentheses and brackets are used to designate the primer pair used and the corresponding frequency at which that primer pair amplified the probe. All probes are shown in FIG. 4.

Primer Pair | Amplicon Size (bp) | Frequency | Designation | Acquired Gene Designation | Function
AT-FOR1/AT-REV1; (AT-FOR2/AT-REV1) | 920 | 7/12; (12/31) | Probe 1; SEQ ID NO: 1; SEQ ID NO: 2 | TAX1 (full-length); SEQ ID NO: 27; SEQ ID NO: 28 | taxadienol acetyl transferase
- | - | - | Probe 1 (continued) | TAX2 (full-length); SEQ ID NO: 25; SEQ ID NO: 26 | unknown
AT-FOR1/AT-REV1; (AT-FOR2/AT-REV1) | 920 | 7/12; (2/31) | Probe 2; SEQ ID NO: 3; SEQ ID NO: 4 | Probe 2 was not used, but likely would have acquired TAX2 because the sequence corresponds directly to this gene. | -
AT-FOR4/AT-REV1 | 903 | 2/29 | Probe 3; SEQ ID NO: 5; SEQ ID NO: 6 | - | -
AT-FOR3/AT-REV1 | 908 | 1/29 | Probe 4; SEQ ID NO: 7; SEQ ID NO: 8 | - | -
AT-FOR4/AT-REV1 | 908 | 1/32 | Probe 5; SEQ ID NO: 9; SEQ ID NO: 10 | TAX5 (full-length); SEQ ID NO: 49; SEQ ID NO: 50 | unknown
AT-FOR2/AT-REV1; (AT-FOR3/AT-REV1); [AT-FOR4/AT-REV1] | 911 | 8/32; (1/29); [1/32] | Probe 6; SEQ ID NO: 11; SEQ ID NO: 12 | TAX6 (full-length); SEQ ID NO: 44; SEQ ID NO: 45 | 10-deacetylbaccatin III-10-O-acetyl transferase
AT-FOR3/AT-REV1 | 968 | 6/29 | Probe 7; SEQ ID NO: 13; SEQ ID NO: 14 | TAX7 (full-length); SEQ ID NO: 51; SEQ ID NO: 52 | unknown
AT-FOR3/AT-REV1; (AT-FOR4/AT-REV1) | 908 | 1/29; (2/32) | Probe 8; SEQ ID NO: 15; SEQ ID NO: 16 | - | -
AT-FOR2/AT-REV1; (AT-FOR3/AT-REV1) | 908 | 1/32; (5/29) | Probe 9; SEQ ID NO: 17; SEQ ID NO: 18 | - | -
AT-FOR4/AT-REV1 | 911 | 2/32 | Probe 10; SEQ ID NO: 19; SEQ ID NO: 20 | TAX10 (full-length); SEQ ID NO: 53; SEQ ID NO: 54 | unknown
AT-FOR4/AT-REV1 | 920 | 1/32 | Probe 11; SEQ ID NO: 21; SEQ ID NO: 22 | - | -
AT-FOR3/AT-REV1; (AT-FOR4/AT-REV1) | 908 | 3/29; (1/32) | Probe 12; SEQ ID NO: 23; SEQ ID NO: 24 | TAX12 (full-length); SEQ ID NO: 55; SEQ ID NO: 56 | unknown
- | - | - | TAX13 does not appear to directly correspond to any of the above listed Probes | TAX13 (full-length); SEQ ID NO: 57; SEQ ID NO: 58 | unknown

[0118] Notably, Probe 1, derived from the primers AT-FOR1 (SEQ ID NO: 34) and AT-REV1 (SEQ ID NO: 38), amplified a ~900 bp DNA fragment encoding, with near identity, the proteolytic peptides corresponding to SEQ ID NOs: 31-33 of the purified protein. These results suggested that the amplicon Probe 1 represented the target gene for taxadienol acetyl transacylase. Probe 1 was then ³²P-labeled and employed as a hybridization probe in a screen of the methyl jasmonate-induced T. cuspidata suspension cell λZAP II™ cDNA library. Standard hybridization and purification procedures ultimately led to the isolation of three full-length, unique clones designated TAX1, TAX2, and TAX6 (SEQ ID NOs: 27, 25, and 44, respectively).

[0119] C. Sequence Analysis and Functional Expression

[0120] Clone TAX1 bears an open reading frame of 1317 nucleotides (nt; SEQ ID NO: 27) and encodes a deduced protein of 439 amino acids (aa; SEQ ID NO: 28) with a calculated molecular weight of 49,079 Da. Clone TAX2 bears an open reading frame of 1320 nt (SEQ ID NO: 25) and encodes a deduced protein of 440 aa (SEQ ID NO: 26) with a calculated molecular weight of 50,089 Da. Clone TAX6 bears an open reading frame of 1320 nt (SEQ ID NO: 44) and encodes a deduced protein of 440 aa (SEQ ID NO: 45) with a calculated molecular weight of 49,000 Da.
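As a quick consistency check on the figures above, the following minimal sketch divides each open reading frame length by three to obtain the number of encoded residues and applies a rule-of-thumb average residue mass. The value of 110 Da per residue and the function names are assumptions introduced here for illustration; a calculated molecular weight such as those reported above would instead sum the actual residue masses of the deduced sequence.

```python
AVG_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average residue mass (assumption)


def codons_in_orf(nt_length: int) -> int:
    """Number of codons in an open reading frame of the given length."""
    if nt_length % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    return nt_length // 3


def approx_mass_da(residue_count: int) -> float:
    """Rough protein mass estimate from residue count alone."""
    return residue_count * AVG_RESIDUE_MASS_DA


# ORF lengths as stated in the text.
for clone, nt in (("TAX1", 1317), ("TAX2", 1320), ("TAX6", 1320)):
    residues = codons_in_orf(nt)
    print(f"{clone}: {nt} nt -> {residues} aa, roughly {approx_mass_da(residues) / 1000:.1f} kDa")
```

The estimates land near 48 kDa, in line with the reported values of approximately 49 to 50 kDa.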

[0121] The sizes of TAX1 and TAX2 are consistent with the molecular weight of the native taxadienol transacetylase (MW ~50,000) determined by gel-permeation chromatography (Walker et al., Arch. Biochem. Biophys. 364:273-279, 1999) and SDS polyacrylamide gel electrophoresis (SDS-PAGE). The deduced amino acid sequences of both TAX1 and TAX2 also remotely resemble those of other acetyl transacylases (50-56% identity; 64-67% similarity) involved in different pathways of secondary metabolism in plants (St. Pierre et al., Plant J. 14:703-713, 1998; and Dudareva et al., Plant J. 14:297-304, 1998). When compared to the amino acid sequence information from the tryptic peptide fragments, TAX1 exhibited a very close match (91% identity), whereas TAX2 exhibited conservative differences (70% identity).

[0122] The TAX6 calculated molecular weight of 49,052 Da is consistent with that of the native TAX6 protein (~50 kDa), determined by gel permeation chromatography, indicating the protein to be a functional monomer, and is very similar to the size of the related, monomeric taxadien-5α-ol transacetylase (MW=49,079). The acetyl CoA:10-deacetylbaccatin III-10-O-acetyl transferase from Taxus cuspidata appears to be substantially different in size from the acetyl CoA:10-hydroxytaxane-O-acetyl transferase recently isolated from Taxus chinensis and reported at a molecular weight of 71,000 (Menhard and Zenk, Phytochemistry 50:763-774, 1999).

[0123] The deduced amino acid sequence of TAX6 resembles that of TAX1 (64% identity; 80% similarity) and those of other acetyl transferases (56-57% identity; 65-67% similarity) involved in different pathways of secondary metabolism in plants (Dudareva et al., Plant J. 14:297-304, 1998; St. Pierre et al., Plant J. 14:703-713, 1998). Additionally, TAX6 possesses the HXXXDG (SEQ ID NO: 48) (residues H162, D166, and G167, respectively) motif found in other acyl transferases (Brown et al., J. Biol. Chem. 269:19157-19162, 1994; Carbini and Hersh, J. Neurochem. 61:247-253, 1993; Hendle et al., Biochemistry 34:4287-4298, 1995; and Lewendon et al., Biochemistry 33:1944-1950, 1994); this sequence element has been suggested to function in acyl group transfer from acyl CoA to the substrate alcohol (St. Pierre et al., Plant J. 14:703-713, 1998).
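For illustration, locating an HXXXDG element in a deduced protein sequence can be sketched with a regular expression in which X is any residue. The function name and the short test sequence below are hypothetical; the TAX6 sequence itself (SEQ ID NO: 45) is not reproduced here.

```python
import re

# H, any three residues, then D and G; the lookahead also reports overlapping hits.
HXXXDG = re.compile(r"(?=(H[A-Z]{3}DG))")


def find_hxxxdg(protein: str):
    """Return (1-based position of H, matched hexapeptide) for every HXXXDG hit."""
    return [(m.start() + 1, m.group(1)) for m in HXXXDG.finditer(protein.upper())]


toy_sequence = "MKTAHQWEDGLLP"  # invented sequence, for illustration only
print(find_hxxxdg(toy_sequence))
```

With one-based numbering, a hit whose H is reported at position n places D at n+4 and G at n+5, which matches the H162, D166, G167 spacing noted above.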

[0124] To determine the identity of the putative taxadienol acetyl transacylase, TAX1, TAX2, and TAX6 were subcloned in-frame into the expression vector pCWori+ (Barnes, Methods Enzymol. 272:3-14, 1996) and expressed in E. coli JM109 cells. The transformed bacteria were cultured and induced with isopropyl β-D-thiogalactoside (IPTG), and cell-free extracts were prepared and evaluated for taxadienol acetyl transacylase activity using the previously developed assay procedures (Walker et al., Arch. Biochem. Biophys. 364:273-279, 1999). Clone TAX1 (corresponding directly to Probe 1) expressed high levels of taxadienol acetyl transacylase activity (20% conversion of substrate to product), as determined by radiochemical analysis; the product of this recombinant enzyme was confirmed as taxadien-5α-yl acetate by gas chromatography-mass spectrometry (GC-MS) (FIG. 5). Clone TAX2 did not express taxadienol acetyl transacylase activity and was inactive with the [³H]taxadienol and acetyl CoA co-substrates. However, the clone TAX2 may encode an enzyme for a step later in the Taxol™ biosynthetic pathway (TAX2 has been shown to correspond to Probe 2). Neither of the recombinant proteins expressed from TAX1 or TAX2 was capable of acetylating the advanced Taxol™ precursor 10-deacetylbaccatin III to baccatin III. Thus, based on the demonstration of functionally expressed activity, and the resemblance of the recombinant enzyme in substrate specificity and other physical and chemical properties to the native form, clone TAX1 was confirmed to encode the Taxus taxadienol acetyl transacylase.

[0125] Additionally, the heterologously expressed TAX6 was partially purified by anion-exchange chromatography (O-diethylaminoethylcellulose, Whatman, Clifton, N.J.) and ultrafiltration (Amicon Diaflo YM 10 membrane, Millipore, Bedford, Mass.) to remove interfering hydrolases from the bacterial extract, and the recombinant enzyme was determined to catalyze the conversion of 10-deacetylbaccatin III to baccatin III; the latter is the last diterpene intermediate in the Taxol™ (paclitaxel) biosynthetic pathway. The optimum pH for TAX6 was determined to be 7.5, with half-maximal velocities at pH 6.4 and 7.8. The K_(m) values for 10-deacetylbaccatin III and acetyl CoA were determined to be 10 μM and 8 μM, respectively, by Lineweaver-Burk analysis (for both plots R²=0.97). These kinetic constants for TAX6 are comparable to those of the taxa-4(20),11(12)-dien-5α-ol acetyl transferase, which has K_(m) values for taxadienol and acetyl CoA of 4 μM and 6 μM, respectively. TAX6 appears to acetylate the 10-hydroxyl group of taxoids with a high degree of regioselectivity, since the enzyme does not acetylate the 1β-, 7β-, or 13α-hydroxyl groups of 10-deacetylbaccatin III, nor does it acetylate the 5α-hydroxyl group of taxa-4(20),11(12)-dien-5α-ol.
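The Lineweaver-Burk analysis mentioned above fits a straight line to 1/v versus 1/[S], where the slope is K_(m)/V_(max) and the intercept is 1/V_(max). The following is a minimal sketch of that fit using ordinary least squares. The data are synthetic and noise-free, generated from an assumed K_(m) of 10 μM purely to show that the fit recovers it; they are not the measurements underlying the values reported above, and the function name is an assumption.

```python
def lineweaver_burk(substrate, velocity):
    """Fit 1/v = (Km/Vmax)(1/[S]) + 1/Vmax by ordinary least squares.

    Returns (Km, Vmax) in the units of the inputs.
    """
    xs = [1.0 / s for s in substrate]
    ys = [1.0 / v for v in velocity]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / intercept
    km = slope * vmax
    return km, vmax


# Synthetic Michaelis-Menten data: Km = 10 uM, Vmax = 1.0 (arbitrary units).
TRUE_KM, TRUE_VMAX = 10.0, 1.0
concentrations_um = [2.0, 5.0, 10.0, 20.0, 50.0]
rates = [TRUE_VMAX * s / (TRUE_KM + s) for s in concentrations_um]
km, vmax = lineweaver_burk(concentrations_um, rates)
print(f"Km = {km:.2f} uM, Vmax = {vmax:.2f}")
```

With real assay data the reciprocal transform weights low-substrate points heavily, so the recovered constants would carry more uncertainty than this idealized example suggests.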

[0126] III. Other Transacylases of the Taxol™ Pathway

[0127] The protocol described above yielded twelve related amplicons. Initial use of the first and second amplicons as probes for screening the cDNA library allowed for the isolation and characterization of taxadienol 5-O-acetyl transacylase. In addition to this first confirmed taxadienol 5-O-acetyl transacylase (TAX1), there are at least four additional transacylation steps in the Taxol™ biosynthetic pathway represented by the 2-debenzoyl baccatin III-2-O-benzoyl transacylase, the 10-deacetylbaccatin III-10-O-acetyl transacylase, the baccatin III-13-O-phenylisoserinyl transacylase, and the debenzoyltaxol-N-benzoyl transacylase. The close relationship between the nucleic acid sequences of the twelve amplicons indicates that the remaining amplicon sequences represent partial nucleic acid sequences of the other transacylases in the Taxol™ pathway. Hence, the above-described protocol enables full-length versions of these Taxol™ transacylases to be obtained. The following discussion relating to Taxol™ transacylases refers to taxadienol 5-O-acetyl transacylase, as well as the remaining transacylases of the Taxol™ pathway. Furthermore, one of skill in the art will appreciate that the remaining transacylases can be tested easily for enzymatic activity using functional assays with the appropriate taxoid substrates; see, for example, the assay for taxoid C10 transacylase described in Menhard and Zenk, Phytochemistry 50:763-774, 1999.

[0128] IV. Isolating a Gene Encoding acetyl CoA:taxa-4(20),11(12)-dien-5α-ol O-acetyl transacylase

[0129] A. Experimental Overview

[0130] A newly designed isolation and purification method is described below for the preparation of homogeneous taxadien-5α-ol acetyl transacylase from Taxus canadensis. The purified protein was N-terminally blocked, thereby requiring internal amino acid microsequencing of fragments generated by proteolytic digestion. Peptide fragments so generated were purified by HPLC and sequenced, and one suitable sequence was used to design a set of degenerate PCR primers. Several primer combinations were employed to amplify a series of twelve related, gene-specific DNA sequences (Probes 1-12). Nine of these gene-specific sequences were used as hybridization probes to screen an induced Taxus cuspidata cell cDNA library. This strategy allowed for the successful isolation of eight full-length transacylase cDNA clones. The identity of one of these clones was confirmed by sequence matching to the peptide fragments described above and by heterologous functional expression of transacylase activity in Escherichia coli.

[0131] B. Culture of Cells

[0132] Initiation, propagation and induction of Taxus sp. cell cultures, reagents, procedures for the synthesis of substrates and standards, and general methods for transacylase isolation, characterization and assay have been previously described (Hefner et al., Arch. Biochem. Biophys. 360:62-75, 1998; and Walker et al., Arch. Biochem. Biophys. 364:273-279, 1999). Since all designated Taxus species are considered to be closely related subspecies (Bolsinger and Jaramillo, Silvics of Forest Trees of North America (revised), Pacific Northwest Research Station, USDA, Portland, Oreg., 1990; and Voliotis, Isr. J. Botany 35:47-52, 1986), the Taxus cell sources were chosen for operational considerations because only minor sequence differences and/or allelic variants between proteins and genes of the various “species” were expected. Thus, Taxus canadensis cells were chosen as the source of transacetylase because they express transacetylase at high levels, and Taxus cuspidata cells were selected for cDNA library construction because they produce Taxol™ at high levels.

[0133] C. Isolation and Purification of the Enzyme

[0134] No related terpenol transacylase genes are available in the databases (see below) to permit homology-based cloning. Hence, a protein-based (reverse genetic) approach to cloning the target transacetylase was required. This reverse genetic approach required obtaining a partial amino acid sequence, generating degenerate primers, amplifying a portion of cDNA using PCR, and using the amplified fragment as a probe to detect the correct clone in a cDNA library.

[0135] Unfortunately, the previously described partial protein purification protocol, including an affinity chromatography step, did not yield pure protein for amino acid microsequencing, nor did the protocol yield protein in useful amounts, or provide a sufficiently simplified SDS-PAGE banding pattern to allow assignment of the transacetylase activity to a specific protein (Walker et al., Arch. Biochem. Biophys. 364:273-279, 1999). Furthermore, numerous variations on the affinity chromatography step, as well as the earlier anion exchange and hydrophobic interaction chromatography steps, failed to improve the specific activity of the preparations due to the instability of the enzyme upon manipulation. Also, a five-fold increase in the scale of the preparation resulted in only marginally improved recovery (generally <5% total yield accompanied by removal of >99% of total starting protein). Because the enzyme could not be purified to homogeneity, and because attempts to improve stability by the addition of polyols (sucrose, glycerol), reducing agents (Na₂S₂O₅, ascorbate, dithiothreitol, β-mercaptoethanol), and other proteins (albumin, casein) were also not productive (Walker et al., Arch. Biochem. Biophys. 364:273-279, 1999), this approach had to be abandoned.

[0136] To overcome the problem described above, the following isolation and purification procedure was used. The purity of the taxadienol acetyl transacylase after each fractionation step was assessed by SDS-PAGE according to Laemmli (Laemmli, Nature 227:680-685, 1970); quantification of total protein after each purification step was carried out by the method of Bradford (Analytical Biochem. 72:248-254, 1976) or by Coomassie Blue staining, and transacylase activity was assessed using the methods described in Walker et al., Arch. Biochem. Biophys. 364:273-279, 1999.

[0137] Procedures for protein staining have been described (Wray et al., Anal. Biochem. 118:197-203, 1991). The preparation of the T. canadensis cell-free extracts and all subsequent procedures were performed at 0-4° C. unless otherwise noted. Cells (40 g batches) were frozen in liquid nitrogen and thoroughly pulverized for 1.5 minutes using a mortar and pestle. The resulting frozen powder was transferred to 225 mL of ice-cold 30 mM HEPES buffer (pH 7.4) containing 3 mM dithiothreitol (DTT), XAD-4 polystyrene resin (12 g) and polyvinylpolypyrrolidone (PVPP, 12 g) to adsorb low molecular weight resinous and phenolic compounds. The slurry was slowly stirred for 30 minutes, and the mixture was filtered through four layers of cheesecloth to remove solid adsorbents and particulates. The filtrate was centrifuged at 7000 g for 30 minutes to remove cellular debris, then at 100,000 g for 3 hours, followed by 0.2-μm filtration to yield a soluble protein fraction (in ~200 mL buffer) used as the enzyme source.

[0138] The soluble enzyme fraction was subjected to ultrafiltration (DIAFLO™ YM 30 membrane, Millipore, Bedford, Mass.) to concentrate the fraction from 200 mL to 40 mL and to selectively remove proteins of molecular weight lower than the taxadien-5α-ol acetyl transacylase (previously established at 50,000 Da in Walker et al., Arch. Biochem. Biophys. 364:273-279, 1999). Using a peristaltic pump, the concentrate (40 mL) was applied (2 mL/minute) to a column of O-diethylaminoethylcellulose (2.8×10 cm, Whatman DE-52, Fairfield, N.J.) that had been equilibrated with “equilibration buffer” (30 mM HEPES buffer (pH 7.4) containing 3 mM DTT). After washing with 60 mL of equilibration buffer to remove unbound material, the proteins were eluted with a step gradient of the same buffer containing 50 mM (25 mL), 125 mM (50 mL), and 200 mM (50 mL) NaCl.

[0139] The fractions were assayed as described previously (Walker et al., Arch. Biochem. Biophys. 364:273-279, 1999), and those containing taxadien-5α-ol acetyl transacylase activity (125-mM and 200-mM fractions) were combined (100 mL, ~160 mM) and diluted to 5 mM NaCl (160 mL) by ultrafiltration (DIAFLO™ YM 30 membrane, Millipore, Bedford, Mass.) and repeated dilution with 30 mM HEPES buffer (pH 7.4) containing 3 mM DTT.
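The approximate salt concentration of the pooled fractions follows from a volume-weighted average. The following minimal sketch assumes, as an idealization, that each eluted fraction has the NaCl concentration of its step buffer and that the two 50 mL fractions are pooled completely; the function name is an assumption.

```python
def pooled_concentration(fractions):
    """Volume-weighted mean concentration for (volume_mL, conc_mM) pairs."""
    total_volume = sum(volume for volume, _ in fractions)
    return sum(volume * conc for volume, conc in fractions) / total_volume


# 50 mL eluted at 125 mM NaCl plus 50 mL eluted at 200 mM NaCl.
pooled_mm = pooled_concentration([(50.0, 125.0), (50.0, 200.0)])
# Overall dilution needed to reach the 5 mM NaCl target.
dilution_factor = pooled_mm / 5.0
print(f"pooled: {pooled_mm} mM NaCl; dilution needed: {dilution_factor}-fold")
```

The computed 162.5 mM agrees with the approximately 160 mM stated above, and the roughly 32-fold salt reduction explains why repeated dilution and ultrafiltration cycles are used rather than a single dilution.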

[0140] Further purification was effected by high-resolution anion-exchange and hydroxyapatite chromatography run on a Pharmacia FPLC system coupled to a 280-nm effluent detector. The preparation described above was applied to a preparative anion-exchange column (10×100 mm, Source 15Q, Pharmacia Biotech., Piscataway, N.J.) that was previously washed with “wash buffer” (30 mM HEPES buffer (pH 7.4) containing 3 mM DTT) and 1 M NaCl, and then equilibrated with wash buffer (without NaCl). After removing unbound material, the applied protein was eluted with a linear gradient of 0 to 200 mM NaCl in equilibration buffer (215 mL total volume; 3 mL/minute) (see FIG. 3A). Fractions containing transacetylase activity (eluting at ~80 mM NaCl) were combined and diluted to 5 mM NaCl by ultrafiltration using 30 mM HEPES buffer (pH 7.4) containing 3 mM DTT as diluent, as described above. The desalted protein sample (70 mL) was loaded onto an analytical anion-exchange column (5×50 mm, Source 15Q, Pharmacia Biotech., Piscataway, N.J.) that was washed and equilibrated as before. The column was developed using a shallow, linear salt gradient with elution to 200 mM NaCl (275 mL total volume, 1.5 mL/minute, 3.0 mL fractions). The taxadienol acetyl transacylase eluted at ~55-60 mM NaCl (see FIG. 3B), and the appropriate fractions were combined (15 mL), reconstituted to 45 mL in 30 mM HEPES buffer (pH 6.9) and applied to a ceramic hydroxyapatite column (10×100 mm, Bio-Rad Laboratories, Hercules, Calif.) that was previously washed with 200 mM sodium phosphate buffer (pH 6.9) and then equilibrated with an “equilibration buffer” (30 mM HEPES buffer (pH 6.9) containing 3 mM DTT (without sodium phosphate)). The equilibration buffer was used to desorb weakly associated material, and the bound protein was eluted by a gradient from 0 to 40 mM sodium phosphate in equilibration buffer (125 mL total volume, at 3.0 mL/minute, 3.0 mL fractions) (see FIG. 3C). The fractions containing the highest activity, eluting over 27 mL at 10 mM sodium phosphate, were combined and shown by SDS-PAGE to yield a protein of ~95% purity (a minor contaminant was present at ~35 kDa, see FIG. 3D). The level of transacylase activity was measured after each step in the isolation and purification protocol described above. The level of activity recovered is shown in Table 3.

TABLE 3. Summary of taxadien-5α-ol O-acetyl transferase purification from Taxus cells.

Step | Total Activity (pkat) | Total Protein (mg) | Specific Activity (pkat/mg protein) | Purification (fold)
Crude extract | 302 | 1230 | 0.25 | 1
YM30 ultrafiltration | 136 | 98 | 1.4 | 5.6
DE-52 | 122 | 69 | 1.8 | 7.2
YM30 ultrafiltration | 54 | 55 | 1.0 | 4
Source 15Q (10 × 100 mm) | 47 | 3 | 16 | 63
YM30 ultrafiltration | 19 | 2.6 | 7.3 | 29
Source 15Q (5 × 50 mm) | 13 | 0.12 | 108 | 400
Hydroxyapatite | 10 | 0.05 | 200 | 800
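The derived columns of Table 3 can be recomputed from the total activity and total protein values. The following minimal sketch does so; specific activity is total activity divided by total protein, fold purification is specific activity relative to the crude extract, and percent yield (a column not present in Table 3, added here as an assumption about what a reader may want) is total activity relative to the crude extract. Because the table reports rounded figures, the recomputed fold values differ slightly from those listed, for example about 815-fold rather than 800-fold for the final step.

```python
# (step, total activity in pkat, total protein in mg), as listed in Table 3.
STEPS = [
    ("Crude extract", 302.0, 1230.0),
    ("YM30 ultrafiltration", 136.0, 98.0),
    ("DE-52", 122.0, 69.0),
    ("YM30 ultrafiltration", 54.0, 55.0),
    ("Source 15Q (10 x 100 mm)", 47.0, 3.0),
    ("YM30 ultrafiltration", 19.0, 2.6),
    ("Source 15Q (5 x 50 mm)", 13.0, 0.12),
    ("Hydroxyapatite", 10.0, 0.05),
]


def summarize(steps):
    """Return (step, specific activity, fold purification, percent yield) rows."""
    _, crude_activity, crude_protein = steps[0]
    crude_specific = crude_activity / crude_protein
    rows = []
    for name, activity, protein in steps:
        specific = activity / protein
        rows.append((name, specific, specific / crude_specific,
                     100.0 * activity / crude_activity))
    return rows


rows = summarize(STEPS)
for name, specific, fold, yield_pct in rows:
    print(f"{name:26s} {specific:8.2f} pkat/mg {fold:8.1f}-fold {yield_pct:6.1f}% yield")
```

The recomputed overall recovery of activity through the hydroxyapatite step is about 3%.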

[0141] D. Amino Acid Microsequencing of Taxadienol Acetyl Transacylase

[0142] The purified protein from multiple preparations as described above (>95% pure, ~100 pmol, 50 μg) was subjected to preparative SDS-PAGE (Laemmli, Nature 227:680-685, 1970). The protein band at 50 kDa, corresponding to the taxadienol acetyl transacylase, was excised. Whereas treatment with V8 protease or treatment with cyanogen bromide (CNBr) failed to yield sequenceable peptides, in situ proteolysis with endolysC (Caltech Sequence/Structure Analysis Facility, Pasadena, Calif.) and trypsin (Fernandez et al., Anal. Biochem. 218:112-118, 1994) yielded a number of peptides, as determined by HPLC. Several of these were separated, verified by mass spectrometry (Fernandez et al., Electrophoresis 19:1036-1045, 1998), and subjected to Edman degradative sequencing, from which five distinct and unique amino acid sequences (designated SEQ ID NOs: 29-33) were obtained (FIG. 2).

[0143] E. cDNA Library Construction and Related Manipulations

[0144] A cDNA library was constructed from mRNA isolated from T. cuspidata suspension culture cells that had been induced to maximal Taxol™ production with methyl jasmonate for 16 hours. An optimized protocol for the isolation of total RNA from T. cuspidata cells was developed empirically using a buffer containing 100 mM Tris-HCl (pH 7.5), 4 M guanidine thiocyanate, 25 mM EDTA and 14 mM β-mercaptoethanol. Cells (1.5 g) were disrupted at 0-4° C. using a Polytron™ ultrasonicator (Kinematica AG, Switzerland; 4×15 second bursts at power setting 7), and the resulting homogenate was adjusted to 2% (v/v) Triton X-100 and allowed to stand 15 minutes on ice. An equal volume of 3 M sodium acetate (pH 6.0) was then added, and the mixed solution was incubated on ice for an additional 15 minutes, followed by centrifugation at 15,000 g for 30 minutes at 4° C. The resulting supernatant was mixed with 0.8 volume of isopropanol and allowed to stand on ice for 5 minutes, followed by centrifugation at 15,000 g for 30 minutes at 4° C. The resulting pellet was dissolved in 8 mL of 20 mM Tris-HCl (pH 8.0) containing 1 mM EDTA, adjusted to pH 7.0 by addition of 2 mL of 2 M NaCl in 250 mM MOPS buffer (pH 7.0), and total RNA was recovered by passing this solution over a nucleic acid isolation column (Qiagen, Valencia, Calif.) following the manufacturer's instructions. Poly(A)+ mRNA was then purified from total RNA by chromatography on oligo(dT) beads (Oligotex™ mRNA Kit, Qiagen), and this material was used to construct a library using the λZAP II™ cDNA synthesis kit and Gigapack™ III gold packaging kit from Stratagene, La Jolla, Calif., by following the manufacturer's instructions.

[0145] Unless otherwise stated, standard methods were used for DNA manipulations and cloning (Sambrook et al. (ed.), Molecular Cloning: A Laboratory Manual, 2nd ed., vol. 1-3, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989), and for PCR amplification procedures (Innis et al., PCR Protocols: A Guide to Methods and Applications, Academic Press, New York, 1990). DNA was sequenced using Amplitaq™ (Hoffmann-La Roche Inc., Nutley, N.J.) DNA polymerase and cycle sequencing (fluorescence sequencing) on an ABI Prism™ 373 DNA Sequencer. The E. coli strains XL1-Blue and XL1-Blue MRF′ (Stratagene, La Jolla, Calif.) were used for routine cloning of PCR products and for cDNA library construction, respectively. E. coli XL1-Blue MRF′ cells were used for in vivo excision of purified pBluescript SK from positive plaques, and the excised plasmids were used to transform E. coli SOLR cells.

[0146] F. Degenerate Primer Design and PCR Amplification

[0147] Due to codon degeneracy, only one sequence of the five tryptic peptide fragments obtained (SEQ ID NO: 30 of FIG. 2) was suitable for PCR primer construction. Two such degenerate forward primers, designated AT-FOR1 (SEQ ID NO: 34) and AT-FOR2 (SEQ ID NO: 35), were designed based on this sequence (FIG. 4). Use of the NCBI Blast 2.0 database searching program (Genetics Computer Group, Program Manual for the Wisconsin Package, version 9, Genetics Computer Group, 575 Science Drive, Madison, Wis., 1994) to search for this sequence element among the few defined transacylases of plant origin (St. Pierre et al., Plant J. 14:703-713, 1998; Dudareva et al., Plant J. 14:297-304, 1998; and Yang et al., Plant Mol. Bio. 35:777-789, 1997), and the many deposited sequences of unknown function, allowed the identification of two possible sequence variants of this element (FYPFAGR (SEQ ID NO: 39) and YYPLAGR (SEQ ID NO: 40)) from which two additional degenerate forward primers, designated AT-FOR3 (SEQ ID NO: 36) and AT-FOR4 (SEQ ID NO: 37), were designed (FIG. 4). The sequences employed for this comparison are listed in Table 1. Using this range of functionally defined and undefined sequences, conserved regions were sought for the purpose of designing a degenerate reverse primer (the distinct lack of similarity of the Taxus sequences to genes in the database can be appreciated by reference to FIG. 6), from which one such consensus sequence element (DFGWGKP) (SEQ ID NO: 41) was noted, and was employed for the design of the reverse primer AT-REV1 (SEQ ID NO: 38) (FIG. 4). This set of four forward primers and one reverse primer incorporated a varied number of inosines, and ranged from 72- to 216-fold degeneracy. The remaining four proteolytic peptide fragment sequences (SEQ ID NO: 29, SEQ ID NO: 31, SEQ ID NO: 32, SEQ ID NO: 33 of FIG. 2) were not only less suitable for primer design, but they were not found (by NCBI BLAST™ searching) to be similar to other related sequences, thus suggesting that these represented more specific sequence elements of the Taxus transacetylase gene.
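The "fold degeneracy" of a primer is the number of distinct oligonucleotides it represents, obtained by multiplying the number of bases allowed at each position. The following is a minimal sketch of that arithmetic, not the procedure used to design the disclosed primers. The example oligonucleotide is a hypothetical back-translation of the FYPFAGR motif and is not one of AT-FOR1 to AT-FOR4 or AT-REV1, whose sequences are given in FIG. 4. Treating inosine (written "I") as a single base that adds no degeneracy is an assumption of the sketch.

```python
# Number of bases each IUPAC nucleotide code can stand for.
IUPAC_CHOICES = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
    "I": 1,  # assumption: inosine counts as one base and adds no degeneracy
}


def degeneracy(primer: str) -> int:
    """Return how many distinct sequences a degenerate primer represents."""
    fold = 1
    for base in primer.upper():
        fold *= IUPAC_CHOICES[base]
    return fold


# Hypothetical back-translation of FYPFAGR (TTY TAY CCI TTY GCI GGI MG),
# used only to illustrate the calculation.
EXAMPLE_PRIMER = "TTYTAYCCITTYGCIGGIMG"

if __name__ == "__main__":
    print(degeneracy(EXAMPLE_PRIMER))
```

Replacing a four-fold "N" position with inosine is what keeps the product small; the same positions written as "N" would multiply the result by four each.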

[0148] Each forward primer (150 μM) and the reverse primer (150 μM) were used in separate PCR reactions performed with Taq polymerase (3 U/100 μL reaction containing 2 mM MgCl₂) and employing the induced T. cuspidata cell cDNA library (10⁸ PFU) as template under the following conditions: 94° C. for 5 minutes, 32 cycles at 94° C. for 1 minute, 40° C. for 1 minute and 74° C. for 2 minutes and, finally, 74° C. for 5 minutes. The resulting amplicons (regions amplified by the various primer combinations) were analyzed by agarose gel electrophoresis (Sambrook et al. (ed.), Molecular Cloning: A Laboratory Manual 2nd ed., vol. 1-3, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989) and the products were extracted from the gel, ligated into pCR TOPOT7 (Invitrogen, Carlsbad, Calif.), and transformed into E. coli TOP10F′ cells (Invitrogen, Carlsbad, Calif.). Plasmid DNA was prepared from individual transformants and the inserts were fully sequenced.
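The cycling conditions above can be written down as a small data structure, which makes the total programmed hold time easy to check. This is only an illustrative sketch; the class and constant names are hypothetical, and the calculation ignores the ramp time between temperatures, which depends on the thermal cycler.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float   # block temperature in degrees Celsius
    minutes: float  # hold time at that temperature


# Values taken from the conditions stated in the text.
INITIAL_DENATURATION = Step(94, 5)
CYCLE = (Step(94, 1), Step(40, 1), Step(74, 2))  # denature, anneal, extend
CYCLES = 32
FINAL_EXTENSION = Step(74, 5)


def total_hold_minutes() -> float:
    """Sum of programmed hold times, excluding temperature ramps."""
    per_cycle = sum(step.minutes for step in CYCLE)
    return INITIAL_DENATURATION.minutes + CYCLES * per_cycle + FINAL_EXTENSION.minutes


if __name__ == "__main__":
    print(total_hold_minutes())  # 5 + 32 * 4 + 5 minutes
```

The low annealing temperature (40° C.) is consistent with the use of highly degenerate, inosine-containing primers, which tolerate mismatches only under permissive annealing conditions.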

[0149] The combination of primers AT-FOR1 (SEQ ID NO: 34) and AT-REV1 (SEQ ID NO: 38) yielded a 900-bp amplicon. Cloning and sequencing of the amplicon revealed two unique sequences designated “Probe 1” (SEQ ID NO: 1) and “Probe 2” (SEQ ID NO: 3) (Table 2). The results with the remaining primer combinations are provided in Table 2.

[0150] G. Library screening

[0151] Four separate library-screening experiments were designed using various combinations of the radio-labeled amplicons (Probes 1-12, described supra) as probes. Use of radio-labeled Probe 1 (SEQ ID NO: 1) led to the identification of TAX1 (SEQ ID NO: 27) and TAX2 (SEQ ID NO: 25), and use of radio-labeled Probe 6 (SEQ ID NO: 11) led to the identification of TAX6 (SEQ ID NO: 44). A probe consisting of a mixture of radio-labeled Probe 10 (SEQ ID NO: 19) and Probe 12 (SEQ ID NO: 23) led to the identification of TAX10 (SEQ ID NO: 53) and TAX12 (SEQ ID NO: 55). Finally, a probe containing a mixture of radio-labeled Probes 3, 4, 5, 7, and 8 led to the identification of TAX5, TAX7, and TAX13 (SEQ ID NOs. 49, 51, and 57, respectively). Details of these individual library-screening experiments are provided below.

[0152] The identification of TAX1 (SEQ ID NO: 27) and TAX2 (SEQ ID NO: 25) was accomplished using 1 μg of Probe 1 (SEQ ID NO: 1) that had been amplified by PCR; the resulting amplicon was gel-purified, randomly labeled with [α-³²P]CTP (Feinberg and Vogelstein, Anal. Biochem. 137:216-217, 1984), and used as a hybridization probe to screen membrane lifts of 5×10⁵ plaques grown in E. coli XL1-Blue MRF′. Phage DNA was cross-linked to the nylon membranes by autoclaving on fast cycle 3-4 minutes at 120° C. After cooling, the membranes were washed 5 minutes in 2×SSC, then 5 minutes in 6×SSC (containing 0.5% SDS, 5×Denhardt's reagent, 0.5 g Ficoll (Type 400, Pharmacia, Piscataway, N.J.), 0.5 g polyvinylpyrrolidone (PVP-10), and 0.5 g bovine serum albumin (Fraction V, Sigma, Saint Louis, Mo.) in 100 mL total volume). Hybridization was then performed for 20 hours at 68° C. in 6×SSC, 0.5% SDS and 5×Denhardt's reagent. The nylon membranes were then washed two times for 5 minutes in 2×SSC with 0.1% SDS at 25° C., and then washed 2×30 minutes with 1×SSC and 0.1% SDS at 68° C. After washing, the membranes were exposed for 17 hours to Kodak (Rochester, N.Y.) XAR film at −70° C. (Sambrook et al. (ed.), Molecular Cloning: A Laboratory Manual, 2nd ed., vol. 1-3, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989).

[0153] Of the plaques exhibiting positive signals (~600 total), 60 were purified through two additional rounds of hybridization. Purified λZAPII clones were excised in vivo as pBluescript II SK(−) phagemids and transformed into E. coli SOLR cells (Stratagene, La Jolla, Calif.). The size of each cDNA insert was determined by PCR using T3 and T7 promoter primers, and size-selected inserts (>1.5 kb) were partially sequenced from both ends to sort into unique sequence types and to acquire full-length versions of each (by further screening with a newly designed 5′-probe, if necessary).

[0154] The same basic screening protocol, as illustrated by the results provided below, can be repeated with all of the probes described in Table 2, with the goal of acquiring the full range of full-length, in-frame putative transacylase clones for test of function by expression in E. coli. In the case of Probe 1 (SEQ ID NO: 1), two unique full-length clones, designated TAX1 (SEQ ID NO: 27 and SEQ ID NO: 28) and TAX2 (SEQ ID NO: 25 and SEQ ID NO: 26), were isolated.

[0155] An additional transacylase, TAX6 (SEQ ID NO: 44), was identified by using 40 ng of radio-labeled Probe 6 (SEQ ID NO: 11) to screen the T. cuspidata library. This full-length clone was 99% identical to Probe 6 (SEQ ID NO: 11) and 99% identical to the deduced amino acid sequence of Probe 6 (SEQ ID NO: 12), indicating that the probe had located its cognate.

[0156] Use of 40 ng of radio-labeled Probe 10 (SEQ ID NO: 19) and 40 ng of radio-labeled Probe 12 (SEQ ID NO: 23) led to the identification of the full-length transacylases TAX10 (SEQ ID NO: 53 and SEQ ID NO: 54) and TAX12 (SEQ ID NO: 55 and SEQ ID NO: 56) in separate hybridization screening experiments.

[0157] Use of a probe mixture containing about 6 ng each of Probes 3, 4, 5, 7, and 8 (SEQ ID NOs. 5, 7, 9, 13, and 15, respectively) randomly labeled with [α-³²P]CTP (Feinberg and Vogelstein, Anal. Biochem. 137:216-217, 1984) resulted in the identification of full-length transacylases TAX5 (SEQ ID NO: 49) and TAX7 (SEQ ID NO: 51), which correspond to Probes 5 (SEQ ID NO: 9) and 7 (SEQ ID NO: 13), respectively. An additional full-length transacylase, TAX13 (SEQ ID NO: 57), was also identified; however, this transacylase does not correspond to any of the Probes identified in Table 2.

[0158] H. cDNA Expression in E. coli

[0159] Full-length insert fragments of the relevant plasmids are excised and subcloned in-frame into the expression vector pCWori+ (Barnes, Methods Enzymol. 272:3-14, 1996). This procedure may involve the elimination of internal restriction sites and the addition of appropriate 5′- and 3′-restriction sites for directional ligation into the expression vector using standard PCR protocols (Innis et al., PCR Protocols: A Guide to Methods and Applications, Academic Press: San Diego, 1990) or commercial kits such as the Quick Change Mutagenesis System (Stratagene, La Jolla, Calif.). For example, the full-length transacylase corresponding to Probe 6 (SEQ ID NO: 11) was obtained using the primer set (5′-GGGAATTCCATATGGCAGGCTCAACAGAATTTGTGG-3′ (SEQ ID NO: 46) and 3′-GTTTATACATTGATTCGGAACTAGATCTGATC-5′ (SEQ ID NO: 47)) to amplify the putative full-length acetyl transferase gene and incorporate NdeI and XbaI restriction sites at the 5′- and 3′-termini, respectively, for directional ligation into vector pCWori+ (Barnes, Methods Enzymol. 272:3-14, 1996). All recombinant pCWori+ plasmids are confirmed by sequencing to ensure that no errors have been introduced by the polymerase reactions, and are then transformed into E. coli JM109 by standard methods.
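As a quick check of the primer set quoted above, the following sketch locates the NdeI (CATATG) and XbaI (TCTAGA) recognition sequences in the two primers. The helper names are hypothetical. The second primer is written 3′ to 5′ in the text, so it is reversed (not complemented) to read it 5′ to 3′; because both recognition sequences are palindromic, a site found on one strand is also present on the complementary strand.

```python
NDEI_SITE = "CATATG"  # NdeI recognition sequence
XBAI_SITE = "TCTAGA"  # XbaI recognition sequence

# Primer sequences as quoted in the text (SEQ ID NO: 46 and SEQ ID NO: 47).
FORWARD_5_TO_3 = "GGGAATTCCATATGGCAGGCTCAACAGAATTTGTGG"
REVERSE_3_TO_5 = "GTTTATACATTGATTCGGAACTAGATCTGATC"


def as_5_to_3(seq_3_to_5: str) -> str:
    """Rewrite a sequence given 3'->5' in the conventional 5'->3' direction."""
    return seq_3_to_5[::-1]


def site_position(primer_5_to_3: str, site: str) -> int:
    """Return the 0-based index of the first occurrence of site, or -1."""
    return primer_5_to_3.find(site)


if __name__ == "__main__":
    reverse_5_to_3 = as_5_to_3(REVERSE_3_TO_5)
    print("NdeI in forward primer at index", site_position(FORWARD_5_TO_3, NDEI_SITE))
    print("XbaI in reverse primer at index", site_position(reverse_5_to_3, XBAI_SITE))
```

In the forward primer the ATG contained in the NdeI site is immediately followed by GCA GGC TCA, so the restriction site also supplies the start codon of the reading frame being cloned.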

[0160] Isolated transformants for each full-length insert are grown to A₆₀₀=0.5 at 37° C. in 50 mL Luria-Bertani medium supplemented with 50 μg ampicillin/mL, and a 1-mL inoculum is added to a large scale (100 mL) culture of Terrific Broth (6 g bacto-tryptone, DIFCO Laboratories, Sparks, Md., 12 g yeast extract, EM Science, Cherry Hill, N.J., and 2 mL glycerol in 500 mL water) containing 50 μg ampicillin/mL and thiamine HCl (320 mM) and grown at 28° C. for 24 hours. Approximately 24 hours after induction with 1 mM isopropyl β-D-thiogalactoside (IPTG), the bacterial cells are harvested by centrifugation, disrupted by sonication in assay buffer consisting of 30 mM potassium phosphate (pH 7.4), or 25 mM MOPSO (pH 7.4), followed by centrifugation to yield a soluble enzyme preparation that can be assayed for transacylase activity.

[0161] I. Enzyme assay

[0162] A specific assay for acetyl CoA:taxa-4(20),11(12)-dien-5α-ol O-acetyl transacylase has been described previously (Walker et al., Arch. Biochem. Biophys. 364:273-279, 1999, herein incorporated by reference). Generally, the assay for taxoid acyltransacylases involves the CoA-dependent acyl transfer from acetyl CoA (or other acyl or aroyl CoA ester) to a taxane alcohol, and the isolation and chromatographic separation of the product ester for confirmation of structure by GC-MS (or HPLC-MS) analysis. For another example of such an assay, see Menhard and Zenk, Phytochemistry 50:763-774, 1999.

[0163] The activity of TAX6 (SEQ ID NO: 45) was assayed under standard conditions described in Walker et al., Arch. Biochem. Biophys. 364:273-279, 1999, with 10-deacetylbaccatin III (400 μM, Hauser Chemical Research Inc., Boulder, Colo.) and [2-³H]acetyl CoA (0.45 μCi, 400 μM (NEN, Boston, Mass.)) as co-substrates. The TAX6 (SEQ ID NO: 45) enzyme preparation yielded a single product from reversed-phase radio-HPLC analysis, with a retention time of 7.0 minutes (coincident radio and UV traces) corresponding exactly to that of authentic baccatin III (generously provided by Dr. David Bailey of Hauser Chemical Research Inc., Boulder, Colo.) (FIG. 9). The identity of the biosynthetic product was further verified as baccatin III by combined LC-MS (liquid chromatography-mass spectrometry) analysis (FIG. 10), which demonstrated the identical retention time (8.6±0.1 minute) and mass spectrum for the product and authentic standard. Finally, a sample of the biosynthetic product, purified by silica gel analytical TLC, gave a ¹H-NMR spectrum identical to that of authentic baccatin III, confirming the enzyme as 10-deacetylbaccatin III-10-O-acetyl transferase (TAX6 (SEQ ID NO: 45)) and also confirming that the corresponding gene had been isolated.

EXAMPLES

[0164] 1. Transacylase Protein and Nucleic Acid Sequences

[0165] As described above, the invention provides transacylases and transacylase-specific nucleic acid sequences. With the provision herein of these transacylase sequences, the polymerase chain reaction (PCR) may now be utilized as a preferred method for identifying and producing nucleic acid sequences encoding the transacylases. For example, PCR amplification of the transacylase sequences may be accomplished either by direct PCR from a plant cDNA library or by Reverse-Transcription PCR (RT-PCR) using RNA extracted from plant cells as a template. Transacylase sequences may also be amplified from plant genomic libraries or plant genomic DNA. Methods and conditions for both direct PCR and RT-PCR are known in the art and are described in Innis et al., PCR Protocols: A Guide to Methods and Applications, Academic Press: San Diego, 1990.

[0166] The selection of PCR primers is made according to the portions of the cDNA (or gene) that are to be amplified. Primers may be chosen to amplify small segments of the cDNA, the open reading frame, the entire cDNA molecule or the entire gene sequence. Variations in amplification conditions may be required to accommodate primers of differing lengths; such considerations are well known in the art and are discussed in Innis et al., PCR Protocols: A Guide to Methods and Applications, Academic Press: San Diego, 1990; Sambrook et al. (ed.), Molecular Cloning: A Laboratory Manual 2nd ed., vol. 1-3, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989; and Ausubel et al. (ed.) Current Protocols in Molecular Biology, Greene Publishing and Wiley-Interscience, New York (with periodic updates), 1987. By way of example, the cDNA molecules corresponding to additional transacylases may be amplified using primers directed towards regions of homology between the 5′ and 3′ ends of the TAX1 and TAX2 sequences. Example primers for such a reaction are:

primer 1: 5′ CCT CAT CTT TCC CCC ATT GAT AAT 3′ (SEQ ID NO: 42)

primer 2: 5′ AAA AAG AAA ATA ATT TTG CCA TGC AAG 3′ (SEQ ID NO: 43)

[0167] These primers are illustrative only; it will be appreciated by one skilled in the art that many different primers may be derived from the provided nucleic acid sequences. Re-sequencing of PCR products obtained by these amplification procedures is recommended to facilitate confirmation of the amplified sequence and to provide information on natural variation between transacylase sequences. Oligonucleotides derived from the transacylase sequence may be used in such sequencing methods.

[0168] Oligonucleotides that are derived from the transacylase sequences are encompassed within the scope of the present invention. Preferably, such oligonucleotide primers comprise a sequence of at least 10-20 consecutive nucleotides of the transacylase sequences. To enhance amplification specificity, oligonucleotide primers comprising at least 15, 20, 25, 30, 35, 40, 45 or 50 consecutive nucleotides of these sequences may also be used.

[0169] A. Transacylases in Other Plant Species

[0170] Orthologs of the transacylase genes are present in a number of other members of the Taxus genus. With the provision herein of the transacylase nucleic acid sequences, the cloning by standard methods of cDNAs and genes that encode transacylase orthologs in these other species is now enabled. As described above, orthologs of the disclosed transacylase genes have transacylase biological activity and are typically characterized by possession of at least 50% sequence identity counted over the full length alignment with the amino acid sequence of the disclosed transacylase sequences using NCBI Blast 2.0 (gapped blastp set to default parameters). Proteins with even greater similarity to the reference sequences will show increasing percentage identities when assessed by this method, such as at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 90%, or at least 95% sequence identity.
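To make the notion of percent identity over an alignment concrete, the following sketch counts identical columns in two sequences that have already been aligned (gaps written as "-"). It is an illustration only: it does not perform the alignment and is not a substitute for gapped blastp, and the function name is hypothetical. The denominator is the full alignment length, including gap columns, which is an assumption of this sketch. The example uses the two sequence-element variants FYPFAGR (SEQ ID NO: 39) and YYPLAGR (SEQ ID NO: 40) quoted earlier in the text.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both rows.

    Both inputs must be rows of one alignment (equal length, '-' for gaps).
    Gap columns count toward the length but never as identities.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    identical = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * identical / len(aligned_a)


if __name__ == "__main__":
    # Two motif variants from the text; they align without gaps.
    print(round(percent_identity("FYPFAGR", "YYPLAGR"), 1))
```

Short motifs such as these give coarse percentages; the thresholds in the text refer to identity counted over the full-length alignment of the proteins.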

[0171] Both conventional hybridization and PCR amplification procedures may be utilized to clone sequences encoding transacylase orthologs. Common to both of these techniques is the hybridization of probes or primers that are derived from the transacylase nucleic acid sequences. Furthermore, the hybridization may occur in the context of Northern blots, Southern blots, or PCR.

[0172] Direct PCR amplification may be performed on cDNA or genomic libraries prepared from any of various plant species, or RT-PCR may be performed using mRNA extracted from plant cells using standard methods. PCR primers will comprise at least 10 consecutive nucleotides of the transacylase sequences. One of skill in the art will appreciate that sequence differences between the transacylase nucleic acid sequence and the target nucleic acid to be amplified may result in lower amplification efficiencies. To compensate for this, longer PCR primers or lower annealing temperatures may be used during the amplification cycle. Where lower annealing temperatures are used, sequential rounds of amplification using nested primer pairs may be necessary to enhance specificity.

[0173] For conventional hybridization techniques the hybridization probe is preferably conjugated with a detectable label such as a radioactive label, and the probe is preferably at least 10 nucleotides in length. As is well known in the art, increasing the length of hybridization probes tends to give enhanced specificity. The labeled probe derived from the transacylase nucleic acid sequence may be hybridized to a plant cDNA or genomic library and the hybridization signal detected using methods known in the art. The hybridizing colony or plaque (depending on the type of library used) is then purified and the cloned sequence contained in that colony or plaque is isolated and characterized.

[0174] Orthologs of the transacylases alternatively may be obtained by immunoscreening of an expression library. With the provision herein of the disclosed transacylase nucleic acid sequences, the enzymes may be expressed and purified in a heterologous expression system (e.g., E. coli) and used to raise antibodies (monoclonal or polyclonal) specific for transacylases. Antibodies may also be raised against synthetic peptides derived from the transacylase amino acid sequence presented herein. Methods of raising antibodies are well known in the art and are described generally in Harlow and Lane, Antibodies, A Laboratory Manual, Cold Spring Harbor Press, Cold Spring Harbor, N.Y., 1988. Such antibodies can then be used to screen an expression cDNA library produced from a plant. This screening will identify the transacylase ortholog. The selected cDNAs can be confirmed by sequencing and enzyme activity assays.

[0175] B. Taxol™ Transacylase Variants

[0176] With the provision of the transacylase amino acid sequences (SEQ ID NOs: 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 45, 50, 52, 54, 56, and 58) and the corresponding cDNA (SEQ ID NOs: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 44, 49, 51, 53, 55, and 57), variants of these sequences can now be created.

[0177] Variant transacylases include proteins that differ in amino acid sequence from the transacylase sequences disclosed, but that retain transacylase biological activity. Such proteins may be produced by manipulating the nucleotide sequence encoding the transacylase using standard procedures such as site-directed mutagenesis or the polymerase chain reaction. The simplest modifications involve the substitution of one or more amino acids for amino acids having similar biochemical properties. These so-called “conservative substitutions” are likely to have minimal impact on the activity of the resultant protein. Table 4 shows amino acids which may be substituted for an original amino acid in a protein and which are regarded as conservative substitutions.

TABLE 4

Original Residue    Conservative Substitutions
ala                 ser
arg                 lys
asn                 gln; his
asp                 glu
cys                 ser
gln                 asn
glu                 asp
gly                 pro
his                 asn; gln
ile                 leu; val
leu                 ile; val
lys                 arg; gln; glu
met                 leu; ile
phe                 met; leu; tyr
ser                 thr
thr                 ser
trp                 tyr
tyr                 trp; phe
val                 ile; leu
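For readers who wish to apply Table 4 programmatically, the table can be encoded as a simple lookup. The sketch below is illustrative only, and the names are hypothetical. It treats the table as directional, exactly as printed: a substitution is reported as conservative only if it is listed in the row of the original residue.

```python
# Table 4 encoded as: original residue -> set of conservative substitutions.
CONSERVATIVE_SUBSTITUTIONS = {
    "ala": {"ser"},
    "arg": {"lys"},
    "asn": {"gln", "his"},
    "asp": {"glu"},
    "cys": {"ser"},
    "gln": {"asn"},
    "glu": {"asp"},
    "gly": {"pro"},
    "his": {"asn", "gln"},
    "ile": {"leu", "val"},
    "leu": {"ile", "val"},
    "lys": {"arg", "gln", "glu"},
    "met": {"leu", "ile"},
    "phe": {"met", "leu", "tyr"},
    "ser": {"thr"},
    "thr": {"ser"},
    "trp": {"tyr"},
    "tyr": {"trp", "phe"},
    "val": {"ile", "leu"},
}


def is_conservative(original: str, substitute: str) -> bool:
    """True if Table 4 lists substitute in the row for original."""
    allowed = CONSERVATIVE_SUBSTITUTIONS.get(original.lower(), set())
    return substitute.lower() in allowed


if __name__ == "__main__":
    print(is_conservative("ala", "ser"))  # listed in the ala row
    print(is_conservative("ser", "ala"))  # not listed in the ser row
```

Residues that have no row in Table 4 (proline, for example) simply return False for every substitution in this sketch.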

[0178] More substantial changes in enzymatic function or other features may be obtained by selecting substitutions that are less conservative than those in Table 4, i.e., selecting residues that differ more significantly in their effect on maintaining: (a) the structure of the polypeptide backbone in the area of the substitution, for example, as a sheet or helical conformation; (b) the charge or hydrophobicity of the molecule at the target site; or (c) the bulk of the side chain. The substitutions which in general are expected to produce the greatest changes in protein properties will be those in which: (a) a hydrophilic residue, e.g., seryl or threonyl, is substituted for (or by) a hydrophobic residue, e.g., leucyl, isoleucyl, phenylalanyl, valyl or alanyl; (b) a cysteine or proline is substituted for (or by) any other residue; (c) a residue having an electropositive side chain, e.g., lysyl, arginyl, or histidyl, is substituted for (or by) an electronegative residue, e.g., glutamyl or aspartyl; or (d) a residue having a bulky side chain, e.g., phenylalanine, is substituted for (or by) one not having a side chain, e.g., glycine. The effects of these amino acid substitutions or deletions or additions may be assessed for transacylase derivatives by analyzing the ability of the derivative proteins to catalyse the conversion of one Taxol™ precursor to another Taxol™ precursor.

[0179] Variant transacylase cDNA or genes may be produced by standard DNA mutagenesis techniques, for example, M13 primer mutagenesis. Details of these techniques are provided in Sambrook et al. (ed.), Molecular Cloning: A Laboratory Manual 2nd ed., vol. 1-3, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989, Ch. 15. By the use of such techniques, variants may be created that differ in minor ways from the transacylase cDNA or gene sequences, yet that still encode a protein having transacylase biological activity. DNA molecules and nucleotide sequences that are derivatives of those specifically disclosed herein and that differ from those disclosed by the deletion, addition, or substitution of nucleotides while still encoding a protein having transacylase biological activity are comprehended by this invention. In their simplest form, such variants may differ from the disclosed sequences by alteration of the coding region to fit the codon usage bias of the particular organism into which the molecule is to be introduced.

[0180] Alternatively, the coding region may be altered by taking advantage of the degeneracy of the genetic code to alter the coding sequence in such a way that, while the nucleotide sequence is substantially altered, it nevertheless encodes a protein having an amino acid sequence identical or substantially similar to the disclosed transacylase amino acid sequences. For example, the fifteenth amino acid residue of TAX2 (SEQ ID NO: 26) is alanine. This is encoded in the open reading frame (ORF) by the nucleotide codon triplet GCG. Because of the degeneracy of the genetic code, three other nucleotide codon triplets (GCA, GCC, and GCT) also code for alanine. Thus, the nucleotide sequence of the ORF can be changed at this position to any of these three codons without affecting the amino acid composition of the encoded protein or the characteristics of the protein. Based upon the degeneracy of the genetic code, variant DNA molecules may be derived from the cDNA and gene sequences disclosed herein using standard DNA mutagenesis techniques as described above, or by synthesis of DNA sequences. Thus, this invention also encompasses nucleic acid sequences that encode the transacylase protein but that vary from the disclosed nucleic acid sequences by virtue of the degeneracy of the genetic code.
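The alanine example can be reproduced with a few lines of code. The sketch below builds the standard genetic code from its conventional compact representation (bases ordered T, C, A, G) and lists the codons synonymous with a given codon. The function and constant names are hypothetical, and the use of the standard code (rather than an organism-specific table) is an assumption of the sketch.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon in TCAG x TCAG x TCAG order;
# '*' marks a stop codon.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(codon: str) -> list:
    """Return the other codons that encode the same amino acid, sorted."""
    codon = codon.upper()
    amino_acid = CODON_TABLE[codon]
    return sorted(
        other for other, aa in CODON_TABLE.items()
        if aa == amino_acid and other != codon
    )


if __name__ == "__main__":
    print(CODON_TABLE["GCG"], synonymous_codons("GCG"))
```

The same lookup applied codon by codon is the basis for adapting a coding region to the codon usage bias of a host organism without changing the encoded protein.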

[0181] Variants of the transacylase may also be defined in terms of their sequence identity with the transacylase amino acid and nucleic acid sequences described supra. As described above, transacylases have transacylase biological activity and share at least 60% sequence identity with the disclosed transacylase sequences. Nucleic acid sequences that encode such proteins may readily be determined simply by applying the genetic code to the amino acid sequence of the transacylase, and such nucleic acid molecules may be readily produced by assembling oligonucleotides corresponding to portions of the sequence.

[0182] As previously mentioned, another method of identifying variants of the transacylases is nucleic acid hybridization. Nucleic acid molecules that are derived from the transacylase cDNA and gene sequences include molecules that hybridize under various conditions to the disclosed Taxol™ transacylase nucleic acid molecules, or fragments thereof. Generally, hybridization conditions are classified into categories, for example very high stringency, high stringency, and low stringency. The conditions for probes that are about 600 base pairs or more in length are provided below in three corresponding categories.

Very High Stringency (detects sequences that share 90% sequence identity)
Hybridization in    5× SSC      at 65° C.                     16 hours
Wash twice in       2× SSC      at room temp.                 15 minutes each
Wash twice in       0.5× SSC    at 65° C.                     20 minutes each

High Stringency (detects sequences that share 80% sequence identity or greater)
Hybridization in    5× SSC      at 65° C.                     16 hours
Wash twice in       2× SSC      at room temp.                 20 minutes each
Wash once in        1× SSC      at 55° C.                     30 minutes each

Low Stringency (detects sequences that share greater than 50% sequence identity)
Hybridization in    6× SSC      at room temp.                 16 hours
Wash twice in       3× SSC      at room temp. (20-21° C.)     20 minutes each
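The three categories can be captured as structured records, which is convenient when choosing conditions for a probe of known expected similarity to its target. The following sketch is illustrative only; the class, field, and function names are hypothetical, and the selection rule (pick the most stringent category whose stated identity threshold the expected identity meets) is an assumption of the sketch rather than a rule stated in the text. The low-stringency threshold is given in the text as "greater than 50%" and is treated here as 50 for simplicity.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Protocol:
    name: str
    min_identity_pct: int     # identity the category is said to detect
    hybridization: str
    washes: Tuple[str, ...]


# Ordered from most to least stringent; values transcribed from the text.
PROTOCOLS = (
    Protocol("very high", 90, "5x SSC, 65 C, 16 h",
             ("2x SSC, room temp., 2 x 15 min", "0.5x SSC, 65 C, 2 x 20 min")),
    Protocol("high", 80, "5x SSC, 65 C, 16 h",
             ("2x SSC, room temp., 2 x 20 min", "1x SSC, 55 C, 1 x 30 min")),
    Protocol("low", 50, "6x SSC, room temp., 16 h",
             ("3x SSC, room temp., 2 x 20 min",)),
)


def most_stringent_for(expected_identity_pct: float) -> Optional[Protocol]:
    """Most stringent category expected to detect the given identity."""
    for protocol in PROTOCOLS:
        if expected_identity_pct >= protocol.min_identity_pct:
            return protocol
    return None  # below the range covered by the listed categories


if __name__ == "__main__":
    chosen = most_stringent_for(85)
    print(chosen.name if chosen else "no listed category")
```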

[0183] The sequences encoding the transacylases identified through hybridization may be incorporated into transformation vectors and introduced into host cells to produce transacylase.

[0184] 2. Introduction of Transacylases into Plants

[0185] After a cDNA (or gene) encoding a protein involved in the determination of a particular plant characteristic has been isolated, standard techniques may be used to express the cDNA in transgenic plants in order to modify the particular plant characteristic. The basic approach is to clone the cDNA into a transformation vector, such that the cDNA is operably linked to control sequences (e.g., a promoter) directing expression of the cDNA in plant cells. The transformation vector is then introduced into plant cells by any of various techniques (e.g., electroporation) and progeny plants containing the introduced cDNA are selected. Preferably all or part of the transformation vector stably integrates into the genome of the plant cell. That part of the transformation vector that integrates into the plant cell and that contains the introduced cDNA and associated sequences for controlling expression (the introduced “transgene”) may be referred to as the recombinant expression cassette.

[0186] Selection of progeny plants containing the introduced transgene may be made based upon the detection of an altered phenotype. Such a phenotype may result directly from the cDNA cloned into the transformation vector or may be manifested as enhanced resistance to a chemical agent (such as an antibiotic) as a result of the inclusion of a dominant selectable marker gene incorporated into the transformation vector.

[0187] Successful examples of the modification of plant characteristics by transformation with cloned cDNA sequences are replete in the technical and scientific literature.

[0188] Selected examples, which serve to illustrate the knowledge in this field of technology, include:

[0189] U.S. Pat. No. 5,571,706 (“Plant Virus Resistance Gene and Methods”)

[0190] U.S. Pat. No. 5,677,175 (“Plant Pathogen Induced Proteins”)

[0191] U.S. Pat. No. 5,510,471 (“Chimeric Gene for the Transformation of Plants”)

[0192] U.S. Pat. No. 5,750,386 (“Pathogen-Resistant Transgenic Plants”)

[0193] U.S. Pat. No. 5,597,945 (“Plants Genetically Enhanced for Disease Resistance”)

[0194] U.S. Pat. No. 5,589,615 (“Process for the Production of Transgenic Plants with Increased Nutritional Value Via the Expression of Modified 2S Storage Albumins”)

[0195] U.S. Pat. No. 5,750,871 (“Transformation and Foreign Gene Expression in Brassica Species”)

[0196] U.S. Pat. No. 5,268,526 (“Overexpression of Phytochrome in Transgenic Plants”)

[0197] U.S. Pat. No. 5,262,316 (“Genetically Transformed Pepper Plants and Methods for their Production”)

[0198] U.S. Pat. No. 5,569,831 (“Transgenic Tomato Plants with Altered Polygalacturonase Isoforms”)

[0199] These examples include descriptions of transformation vector selection, transformation techniques, and the assembly of constructs designed to over-express the introduced cDNA. In light of the foregoing and the provision herein of the transacylase amino acid sequences and nucleic acid sequences, it is thus apparent that one of skill in the art will be able to introduce the cDNAs, or homologous or derivative forms of these molecules, into plants in order to produce plants having enhanced transacylase activity. Furthermore, the expression of one or more transacylases in plants may give rise to plants having increased production of Taxol™ and related compounds.

[0200] A. Vector construction, Choice of Promoters

[0201] A number of recombinant vectors suitable for stable transfection of plant cells or for the establishment of transgenic plants have been described, including those described in Weissbach and Weissbach, Methods for Plant Molecular Biology, Academic Press, 1989; and Gelvin et al., Plant and Molecular Biology Manual, Kluwer Academic Publishers, 1990. Typically, plant-transformation vectors include one or more cloned plant genes (or cDNAs) under the transcriptional control of 5′- and 3′-regulatory sequences and a dominant selectable marker. Such plant transformation vectors typically also contain a promoter regulatory region (e.g., a regulatory region controlling inducible or constitutive, environmentally or developmentally regulated, or cell- or tissue-specific expression), a transcription initiation start site, a ribosome binding site, an RNA processing signal, a transcription termination site, and/or a polyadenylation signal.

[0202] Examples of constitutive plant promoters that may be useful for expressing the cDNA include: the cauliflower mosaic virus (CaMV) 35S promoter, which confers constitutive, high-level expression in most plant tissues (see, e.g., Odell et al., Nature 313:810, 1985; Dekeyser et al., Plant Cell 2:591, 1990; Terada and Shimamoto, Mol. Gen. Genet. 220:389, 1990; and Benfey and Chua, Science 250:959-966, 1990); the nopaline synthase promoter (An et al., Plant Physiol. 88:547, 1988); and the octopine synthase promoter (Fromm et al., Plant Cell 1:977, 1989). Agrobacterium-mediated transformation of Taxus species has been accomplished, and the resulting callus cultures have been shown to produce Taxol™ (Han et al., Plant Science 95: 187-196, 1994). Therefore, it is likely that incorporation of one or more of the described transacylases under the influence of a strong promoter (like the CaMV promoter) would increase production yields of Taxol™ and related taxoids in such transformed cells.

[0203] A variety of plant-gene promoters that are regulated in response to environmental, hormonal, chemical, and/or developmental signals also can be used for expression of the cDNA in plant cells, including promoters regulated by: (a) heat (Callis et al., Plant Physiol. 88:965, 1988; Ainley et al., Plant Mol. Biol. 22:13-23, 1993; and Gilmartin et al., The Plant Cell 4:839-949, 1992); (b) light (e.g., the pea rbcS-3A promoter, Kuhlemeier et al., Plant Cell 1:471, 1989, and the maize rbcS promoter, Schaffner and Sheen, Plant Cell 3:997, 1991); (c) hormones, such as abscisic acid (Marcotte et al., Plant Cell 1:969, 1989); (d) wounding (e.g., wunI, Siebertz et al., Plant Cell 1:961, 1989); and (e) chemicals such as methyl jasmonate or salicylic acid (Gatz et al., Ann. Rev. Plant Physiol. Plant Mol. Biol. 48:9-108, 1997).

[0204] Alternatively, tissue-specific (root, leaf, flower, and seed, for example) promoters (Carpenter et al., The Plant Cell 4:557-571, 1992; Denis et al., Plant Physiol. 101:1295-1304, 1993; Opperman et al., Science 263:221-223, 1993; Stockhause et al., The Plant Cell 9:479-489, 1997; Roshal et al., Embo. J. 6:1155, 1987; Schernthaner et al., Embo J. 7:1249, 1988; and Bustos et al., Plant Cell 1:839, 1989) can be fused to the coding sequence to obtain a particular expression in respective organs.

[0205] Alternatively, the native transacylase gene promoters may be utilized. With the provision herein of the transacylase nucleic acid sequences, one of skill in the art will appreciate that standard molecular biology techniques can be used to determine the corresponding promoter sequences. One of skill in the art will also appreciate that less than the entire promoter sequence may be used in order to obtain effective promoter activity. The determination of whether a particular region of this sequence confers effective promoter activity may readily be ascertained by operably linking the selected sequence region to a transacylase cDNA (in conjunction with a suitable 3′ regulatory region, such as the NOS 3′ regulatory region as discussed below) and determining whether the transacylase is expressed.

[0206] Plant-transformation vectors may also include RNA-processing signals, for example, introns, that may be positioned upstream or downstream of the ORF sequence in the transgene. In addition, the expression vectors may also include additional regulatory sequences from the 3′-untranslated region of plant genes, e.g., a 3′-terminator region to increase mRNA stability, such as the PI-II terminator region of potato or the octopine or nopaline synthase (NOS) 3′-terminator regions. The native transacylase gene 3′-regulatory sequence may also be employed.

[0207] Finally, as noted above, plant-transformation vectors may also include dominant selectable marker genes to allow for the ready selection of transformants. Such genes include those conferring antibiotic resistance (e.g., resistance to hygromycin, kanamycin, bleomycin, G418, streptomycin or spectinomycin) and herbicide-resistance genes (e.g., phosphinothricin acetyltransferase).

[0208] B. Arrangement of Taxol™ Transacylase Sequence in a Vector

[0209] The particular arrangement of the transacylase sequence in the transformation vector is selected according to the type of expression of the sequence that is desired.

[0210] In most instances, enhanced transacylase activity is desired, and the transacylase ORF is operably linked to a constitutive high-level promoter such as the CaMV 35S promoter. As noted above, enhanced transacylase activity may also be achieved by introducing into a plant a transformation vector containing a variant form of the transacylase cDNA or gene, for example a form that varies from the exact nucleotide sequence of the transacylase ORF, but that encodes a protein retaining transacylase biological activity.
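
The idea of a variant nucleotide sequence that still encodes the same protein can be illustrated with a minimal sketch. It translates the first 30 nucleotides of SEQ ID NO: 1 and of SEQ ID NO: 3 from the sequence listing with the standard genetic code and compares the results. The function and variable names are illustrative assumptions, not part of the disclosure, and the comparison covers only these short 5′ segments, not the full sequences.

```python
# Standard genetic code, with codons ordered t, c, a, g at each position.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA coding strand into one-letter amino acid codes.

    Spaces are ignored and any trailing partial codon is dropped.
    """
    dna = dna.lower().replace(" ", "")
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First 30 nucleotides of SEQ ID NO: 1 and SEQ ID NO: 3 (sequence listing).
seq1_start = "atgttggtctattatcccccttttgctggg"
seq3_start = "atgctggtctattatcccccttttgctgga"

# Count the positions at which the two nucleotide segments differ.
nucleotide_differences = sum(a != b for a, b in zip(seq1_start, seq3_start))

print(translate(seq1_start))
print(translate(seq3_start))
print(nucleotide_differences)
```

In this segment the two nucleotide sequences differ while the encoded residues agree, which is the kind of silent variation the paragraph contemplates. Whether a given full-length variant retains transacylase activity still has to be determined experimentally.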

[0211] C. Transformation and Regeneration Techniques

[0212] Transformation and regeneration of both monocotyledonous and dicotyledonous plant cells are now routine, and the appropriate transformation technique can be determined by the practitioner. The choice of method varies with the type of plant to be transformed; those skilled in the art will recognize the suitability of particular methods for given plant types. Suitable methods may include, but are not limited to: electroporation of plant protoplasts; liposome-mediated transformation; polyethylene glycol (PEG) mediated transformation; transformation using viruses; micro-injection of plant cells; micro-projectile bombardment of plant cells; vacuum infiltration; and Agrobacterium tumefaciens (AT) mediated transformation. Typical procedures for transforming and regenerating plants are described in the patent documents listed at the beginning of this section.

[0213] D. Selection of Transformed Plants

[0214] Following transformation and regeneration of plants with the transformation vector, transformed plants can be selected using a dominant selectable marker incorporated into the transformation vector. Typically, such a marker confers antibiotic resistance on the seedlings of transformed plants, and selection of transformants can be accomplished by exposing the seedlings to appropriate concentrations of the antibiotic.

[0215] After transformed plants are selected and grown to maturity, they can be assayed using the methods described herein to assess production levels of Taxol™ and related compounds.

[0216] 3. Production of Recombinant Taxol™ Transacylase in Heterologous Expression Systems

[0217] Various yeast strains and yeast-derived vectors are commonly used for the expression of heterologous proteins. For instance, Pichia pastoris expression systems, obtained from Invitrogen (Carlsbad, Calif.), may be used to practice the present invention. Such systems include suitable Pichia pastoris strains, vectors, reagents, transformants, sequencing primers, and media. Available strains include KM71H (a prototrophic strain), SMD1168H (a prototrophic strain), and SMD1168 (a pep4 mutant strain) (Invitrogen Product Catalogue, 1998, Invitrogen, Carlsbad, Calif.).

[0218] Non-yeast eukaryotic vectors may be used with equal facility for expression of proteins encoded by modified nucleotides according to the invention. Mammalian vector/host cell systems containing genetic and cellular control elements capable of carrying out transcription, translation, and post-translational modification are well known in the art. Examples of such systems are the well-known baculovirus system, the ecdysone-inducible expression system that uses regulatory elements from Drosophila melanogaster to allow control of gene expression, and the Sindbis viral-expression system that allows high-level expression in a variety of mammalian cell lines, all of which are available from Invitrogen, Carlsbad, Calif.

[0219] The cloned expression vector encoding one or more transacylases may be transformed into any of various cell types for expression of the cloned nucleotide. Many different types of cells may be used to express modified nucleic acid molecules. Examples include cells of yeasts, fungi, insects, mammals, and plants, including transformed and non-transformed cells. For instance, common mammalian cells that could be used include HeLa cells, SW-527 cells (ATCC deposit #7940), WISH cells (ATCC deposit #CCL-25), Daudi cells (ATCC deposit #CCL-213), Madin-Darby bovine kidney cells (ATCC deposit #CCL-22) and Chinese hamster ovary (CHO) cells (ATCC deposit #CRL-2092). Common yeast cells include Pichia pastoris (ATCC deposit #201178) and Saccharomyces cerevisiae (ATCC deposit #46024). Insect cells include cells from Drosophila melanogaster (ATCC deposit #CRL-10191), the cotton bollworm (ATCC deposit #CRL-9281), and Trichoplusia ni egg cell homoflagellates. Fish cells that may be used include those from rainbow trout (ATCC deposit #CCL-55), salmon (ATCC deposit #CRL-1681), and zebrafish (ATCC deposit #CRL-2147). Amphibian cells that may be used include those of the bullfrog, Rana catesbeiana (ATCC deposit #CCL-41). Reptile cells that may be used include those from Russell's viper (ATCC deposit #CCL-140). Plant cells that could be used include Chlamydomonas cells (ATCC deposit #30485), Arabidopsis cells (ATCC deposit #54069) and tomato plant cells (ATCC deposit #54003). Many of these cell types are commonly used and are available from the ATCC as well as from commercial suppliers such as Pharmacia (Uppsala, Sweden), and Invitrogen.

[0220] Expressed protein may be accumulated within a cell or may be secreted from the cell. Such expressed protein may then be collected and purified. This protein may then be characterized for activity and stability and may be used to practice any of the various methods according to the invention.

[0221] 4. Creation of Transacylase-Specific Binding Agents

[0222] Antibodies to the transacylase enzymes, and fragments thereof, of the present invention may be useful for purification of the enzymes. The provision of the transacylase sequences allows for the production of specific antibody-based binding agents to these enzymes.

[0223] Monoclonal or polyclonal antibodies may be produced to the transacylases, portions of the transacylases, or variants thereof. Optimally, antibodies raised against epitopes on these antigens will specifically detect the enzyme. That is, antibodies raised against the transacylases would recognize and bind the transacylases, and would not substantially recognize or bind to other proteins. The determination that an antibody specifically binds to an antigen is made by any one of a number of standard immunoassay methods, for instance, Western blotting (Sambrook et al. (ed.), Molecular Cloning: A Laboratory Manual, 2nd ed., vol. 1-3, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989).

[0224] To determine that a given antibody preparation (such as a preparation produced in a mouse against TAX1) specifically detects the transacylase by Western blotting, total cellular protein is extracted from cells and electrophoresed on an SDS-polyacrylamide gel. The proteins are then transferred to a membrane (for example, nitrocellulose) by Western blotting, and the antibody preparation is incubated with the membrane. After washing the membrane to remove non-specifically bound antibodies, the presence of specifically bound antibodies is detected by the use of an anti-mouse antibody conjugated to an enzyme such as alkaline phosphatase; application of 5-bromo-4-chloro-3-indolyl phosphate/nitro blue tetrazolium results in the production of a densely blue-colored compound by immuno-localized alkaline phosphatase.

[0225] Antibodies that specifically detect a transacylase will, by this technique, be shown to bind substantially only the transacylase band (having a position on the gel determined by the molecular weight of the transacylase). Non-specific binding of the antibody to other proteins may occur and may be detectable as a weaker signal on the Western blot (which can be quantified by automated radiography). The non-specific nature of this binding will be recognized by one skilled in the art by the weak signal obtained on the Western blot relative to the strong primary signal arising from the specific anti-transacylase binding.

[0226] Antibodies that specifically bind to transacylases belong to a class of molecules that are referred to herein as “specific binding agents.” Specific binding agents that are capable of specifically binding to the transacylase of the present invention may include polyclonal antibodies, monoclonal antibodies and fragments of monoclonal antibodies such as Fab, F(ab′)₂ and Fv fragments, as well as any other agent capable of specifically binding to one or more epitopes on the proteins.

[0227] Substantially pure transacylase suitable for use as an immunogen can be isolated from transfected cells, transformed cells, or from wild-type cells. Concentration of protein in the final preparation is adjusted, for example, by concentration on an Amicon filter device, to the level of a few micrograms per milliliter. Alternatively, peptide fragments of a transacylase may be utilized as immunogens. Such fragments may be chemically synthesized using standard methods, or may be obtained by cleavage of the whole transacylase enzyme followed by purification of the desired peptide fragments. Peptides as short as three or four amino acids in length are immunogenic when presented to an immune system in the context of a Major Histocompatibility Complex (MHC) molecule, such as MHC class I or MHC class II. Accordingly, peptides comprising at least 3 and preferably at least 4, 5, 6 or more consecutive amino acids of the disclosed transacylase amino acid sequences may be employed as immunogens for producing antibodies.

[0228] Because naturally occurring epitopes on proteins frequently comprise amino acid residues that are not adjacently arranged in the peptide when the peptide sequence is viewed as a linear molecule, it may be advantageous to utilize longer peptide fragments from the transacylase amino acid sequences for producing antibodies. Thus, for example, peptides that comprise at least 10, 15, 20, 25, or 30 consecutive amino acid residues of the amino acid sequence may be employed. Monoclonal or polyclonal antibodies to the intact transacylase, or peptide fragments thereof, may be prepared as described below.
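
As a rough illustration of what "consecutive amino acid residues" means in practice, the sketch below enumerates every peptide of a chosen length from a protein sequence. It uses the first 17 residues of SEQ ID NO: 2 in one-letter code. The helper name `peptide_windows` is hypothetical, and the sketch only lists candidate fragments; it makes no prediction about which of them would be immunogenic.

```python
def peptide_windows(protein: str, length: int) -> list[str]:
    """Return every run of `length` consecutive residues in `protein`.

    An empty list is returned when the protein is shorter than `length`.
    """
    if length <= 0:
        raise ValueError("length must be a positive integer")
    return [protein[i:i + length] for i in range(len(protein) - length + 1)]


# First 17 residues of SEQ ID NO: 2 (Met Leu Val Tyr Tyr Pro Pro Phe Ala Gly
# Arg Leu Arg Glu Thr Glu Asn), written in one-letter code.
seq2_start = "MLVYYPPFAGRLRETEN"

# All candidate 10-residue peptides from this segment.
candidates = peptide_windows(seq2_start, 10)
for peptide in candidates:
    print(peptide)
```

The same function can be called with 15, 20, 25, or 30 on a full-length sequence to list the longer fragments mentioned above.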

[0229] A. Monoclonal Antibody Production by Hybridoma Fusion

[0230] Monoclonal antibody to any of various epitopes of the transacylase enzymes that are identified and isolated as described herein can be prepared from murine hybridomas according to the classic method of Kohler & Milstein, Nature 256:495, 1975, or a derivative method thereof. Briefly, a mouse is repetitively inoculated with a few micrograms of the selected protein over a period of a few weeks. The mouse is then sacrificed, and the antibody-producing cells of the spleen isolated. The spleen cells are fused by means of polyethylene glycol with mouse myeloma cells, and the excess unfused cells destroyed by growth of the system on selective media comprising aminopterin (HAT media). The successfully fused cells are diluted and aliquots of the dilution placed in wells of a microtiter plate where growth of the culture is continued. Antibody-producing clones are identified by detection of antibody in the supernatant fluid of the wells by immunoassay procedures, such as ELISA, as originally described by Engvall, Methods Enzymol. 70:419, 1980, or a derivative method thereof. Selected positive clones can be expanded and their monoclonal antibody product harvested for use. Detailed procedures for monoclonal antibody production are described in Harlow & Lane, Antibodies, A Laboratory Manual, Cold Spring Harbor Laboratory, N.Y., 1988.

[0231] B. Polyclonal Antibody Production by Immunization

[0232] Polyclonal antiserum containing antibodies to heterogeneous epitopes of a single protein can be prepared by immunizing suitable animals with the expressed protein, which can be unmodified or modified to enhance immunogenicity. Effective polyclonal antibody production is affected by many factors related both to the antigen and the host species. For example, small molecules tend to be less immunogenic than other molecules and may require the use of carriers and an adjuvant. Also, host animals vary in response to site of inoculations and dose, with both inadequate and excessive doses of antigen resulting in low-titer antisera. Small doses (ng level) of antigen administered at multiple intradermal sites appear to be most reliable. An effective immunization protocol for rabbits can be found in Vaitukaitis et al., J. Clin. Endocrinol. Metab. 33:988-991, 1971.

[0233] Booster injections can be given at regular intervals, and antiserum harvested when the antibody titer thereof, as determined semi-quantitatively, for example, by double immunodiffusion in agar against known concentrations of the antigen, begins to fall. See, for example, Ouchterlony et al., Handbook of Experimental Immunology, Weir, D. (ed.), Chapter 19, Blackwell, 1973. A plateau concentration of antibody is usually in the range of 0.1 to 0.2 mg/mL of serum (about 12 μM). Affinity of the antisera for the antigen is determined by preparing competitive binding curves using conventional methods.
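
The mass-to-molar conversion behind a plateau figure of this kind can be sketched as follows. The sketch assumes the serum antibody is predominantly IgG with a molar mass of about 150,000 g/mol; that value is an assumption introduced here and is not stated in the text. Under that assumption, 0.1 to 0.2 mg/mL works out to roughly 0.7 to 1.3 μM, so the parenthetical molar figure in the preceding paragraph may rest on a different basis and is worth checking against the cited protocol.

```python
def mg_per_ml_to_micromolar(conc_mg_per_ml: float,
                            molar_mass_g_per_mol: float) -> float:
    """Convert a mass concentration in mg/mL to a molar concentration in uM."""
    grams_per_liter = conc_mg_per_ml              # 1 mg/mL equals 1 g/L
    molar = grams_per_liter / molar_mass_g_per_mol
    return molar * 1e6                            # mol/L to umol/L


# Assumed molar mass of IgG (about 150 kDa); not taken from the source text.
IGG_MOLAR_MASS = 150_000.0

low = mg_per_ml_to_micromolar(0.1, IGG_MOLAR_MASS)
high = mg_per_ml_to_micromolar(0.2, IGG_MOLAR_MASS)
print(f"{low:.2f} to {high:.2f} uM")
```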

[0234] C. Antibodies Raised by Injection of cDNA

[0235] Antibodies may be raised against the transacylases of the present invention by subcutaneous injection of a DNA vector that expresses the enzymes in laboratory animals, such as mice. Delivery of the recombinant vector into the animals may be achieved using a hand-held form of the Biolistic system (Sanford et al., Particulate Sci. Technol. 5:27-37, 1987, as described by Tang et al., Nature (London) 356:153-154, 1992). Expression vectors suitable for this purpose may include those that express the cDNA of the enzyme under the transcriptional control of either the human β-actin promoter or the cytomegalovirus (CMV) promoter. Methods of administering naked DNA to animals in a manner resulting in expression of the DNA in the body of the animal are well known and are described, for example, in U.S. Pat. Nos. 5,620,896 (“DNA Vaccines Against Rotavirus Infections”); 5,643,578 (“Immunization by Inoculation of DNA Transcription Unit”); and 5,593,972 (“Genetic Immunization”), and references cited therein.

[0236] D. Antibody Fragments

[0237] Antibody fragments may be used in place of whole antibodies and may be readily expressed in prokaryotic host cells. Methods of making and using immunologically effective portions of monoclonal antibodies, also referred to as “antibody fragments,” are well known and include those described in Better & Horowitz, Methods Enzymol. 178:476-496, 1989; Glockshuber et al., Biochemistry 29:1362-1367, 1990; and U.S. Pat. Nos. 5,648,237 (“Expression of Functional Antibody Fragments”); 4,946,778 (“Single Polypeptide Chain Binding Molecules”); and 5,455,030 (“Immunotherapy Using Single Chain Polypeptide Binding Molecules”), and references cited therein.

[0238] 5. Taxol™ Production in vivo

[0239] The creation of recombinant vectors and transgenic organisms expressing the vectors is important for controlling the production of transacylases. These vectors can be used to decrease transacylase production, or to increase transacylase production. A decrease in transacylase production will likely result from the inclusion of an antisense sequence or a catalytic nucleic acid sequence that targets the transacylase-encoding nucleic acid sequence. Conversely, increased production of transacylase can be achieved by including at least one additional transacylase-encoding sequence in the vector. These vectors can then be introduced into a host cell, thereby altering transacylase production. In the case of increased production, the resulting transacylase may be used in in vitro systems, as well as in vivo for increased production of Taxol™, other taxoids, intermediates of the Taxol™ biosynthetic pathway, and other products.
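
An antisense sequence is the reverse complement of the target coding strand. A minimal sketch of that relationship, using the first 20 nucleotides of SEQ ID NO: 1 as an arbitrary target segment, is given below. The choice of segment and the function name are illustrative assumptions; designing an effective antisense or catalytic construct involves considerations (length, target accessibility, specificity) that this sketch does not address.

```python
# Translation table mapping each base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


# First 20 nucleotides of SEQ ID NO: 1, used here only as an example target.
target = "atgttggtctattatccccc"
antisense = reverse_complement(target)
print(antisense)
```

Applying the function twice returns the original segment, which is a quick sanity check on any sequence prepared this way.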

[0240] Increased production of Taxol™ and related taxoids in vivo can be accomplished by transforming a host cell, such as one derived from the Taxus genus, with a vector containing one or more nucleic acid sequences encoding one or more transacylases. Furthermore, the heterologous or homologous transacylase sequences can be placed under the control of a constitutive promoter, or an inducible promoter. This will lead to the increased production of transacylase, thus eliminating any rate-limiting effect on Taxol™ production caused by the expression and/or activity level of the transacylase.

[0241] 6. Taxol™ Production in vitro

[0242] Currently, Taxol™ is produced by a semisynthetic method described in Hezari and Croteau, Planta Medica 63:291-295, 1997. This method involves extracting 10-deacetyl-baccatin III, or baccatin III, intermediates in the Taxol™ biosynthetic pathway, and then finishing the production of Taxol™ using in vitro techniques. As more enzymes are identified in the Taxol™ biosynthetic pathway, it may become possible to completely synthesize Taxol™ in vitro, or at least increase the number of steps that can be performed in vitro. Hence, the transacylases of the present invention may be used to facilitate the production of Taxol™ and related taxoids in synthetic or semi-synthetic methods. Accordingly, the present invention enables the production of transgenic organisms that not only produce increased levels of Taxol™, but also transgenic organisms that produce increased levels of important intermediates, such as 10-deacetyl-baccatin III and baccatin III.

[0243] Having illustrated and described the principles of the invention in multiple embodiments and examples, it should be apparent to those skilled in the art that the invention can be modified in arrangement and detail without departing from such principles. We claim all modifications coming within the spirit and scope of the following claims.

1 74 1 920 DNA Taxus cuspidata 1 atgttggtct attatccccc ttttgctgggcgcctcagag agacagaaaa tggggatctg 60 gaagtggaat gcacagggga gggtgctatgtttttggaag ccatggcaga caatgagctg 120 tctgtgttgg gagattttga tgacagcaatccatcatttc agcagctact tttttcgctt 180 ccactcgata ccaatttcaa agacctctctcttctggttg ttcaggtaac tcgttttaca 240 tgtggaggct ttgttgttgg agtgagtttccaccatggtg tatgtgatgg tcgaggagcg 300 gcccaatttc ttaaaggttt ggcagaaatggcacggggag aggttaagct ctcattggaa 360 ccaatatgga atatggaact agtgaagcttgatgacccta aatacctcca attttttcac 420 tttgaattcc tacgagcgcc ttcaattgttgagaaaattg ttcaaacata ttttattata 480 gatttggaga ccataaatta tatcaaacaatctgttatgg aagaatgtaa agaattttgc 540 tcttcattcg aagttgcatc agcaatgacttggatagcaa ggacaagagc ttttcaaatt 600 ccagaaagtg agtacgtgaa aattctcttcggaatggaca tgaggaactc atttaatccc 660 cctcttccaa gcggatacta tggtaactccattggtaccg catgtgcagt ggataatgtt 720 caagacctct taagtggatc tcttttgcgtgctataatga ttataaagaa atcaaaggtc 780 tctttaaatg ataatttcaa gtcaagagctgtggtgaagc catctgaatt ggatgtgaat 840 atgaatcatg aaaacgtagt tgcatttgctgattggagcc gattgggatt tgatgaagtg 900 gattttggct gggggaaacc 920 2 306 PRTTaxus cuspidata 2 Met Leu Val Tyr Tyr Pro Pro Phe Ala Gly Arg Leu ArgGlu Thr Glu 1 5 10 15 Asn Gly Asp Leu Glu Val Glu Cys Thr Gly Glu GlyAla Met Phe Leu 20 25 30 Glu Ala Met Ala Asp Asn Glu Leu Ser Val Leu GlyAsp Phe Asp Asp 35 40 45 Ser Asn Pro Ser Phe Gln Gln Leu Leu Phe Ser LeuPro Leu Asp Thr 50 55 60 Asn Phe Lys Asp Leu Ser Leu Leu Val Val Gln ValThr Arg Phe Thr 65 70 75 80 Cys Gly Gly Phe Val Val Gly Val Ser Phe HisHis Gly Val Cys Asp 85 90 95 Gly Arg Gly Ala Ala Gln Phe Leu Lys Gly LeuAla Glu Met Ala Arg 100 105 110 Gly Glu Val Lys Leu Ser Leu Glu Pro IleTrp Asn Met Glu Leu Val 115 120 125 Lys Leu Asp Asp Pro Lys Tyr Leu GlnPhe Phe His Phe Glu Phe Leu 130 135 140 Arg Ala Pro Ser Ile Val Glu LysIle Val Gln Thr Tyr Phe Ile Ile 145 150 155 160 Asp Leu Glu Thr Ile AsnTyr Ile Lys Gln Ser Val Met Glu Glu Cys 165 170 175 Lys Glu Phe Cys SerSer Phe Glu Val Ala Ser Ala Met Thr Trp Ile 
180 185 190 Ala Arg Thr ArgAla Phe Gln Ile Pro Glu Ser Glu Tyr Val Lys Ile 195 200 205 Leu Phe GlyMet Asp Met Arg Asn Ser Phe Asn Pro Pro Leu Pro Ser 210 215 220 Gly TyrTyr Gly Asn Ser Ile Gly Thr Ala Cys Ala Val Asp Asn Val 225 230 235 240Gln Asp Leu Leu Ser Gly Ser Leu Leu Arg Ala Ile Met Ile Ile Lys 245 250255 Lys Ser Lys Val Ser Leu Asn Asp Asn Phe Lys Ser Arg Ala Val Val 260265 270 Lys Pro Ser Glu Leu Asp Val Asn Met Asn His Glu Asn Val Val Ala275 280 285 Phe Ala Asp Trp Ser Arg Leu Gly Phe Asp Glu Val Asp Phe GlyTrp 290 295 300 Gly Lys 305 3 920 DNA Taxus cuspidata 3 atgctggtctattatccccc ttttgctgga aggctgagaa acacagaaaa tggggaactt 60 gaagtggagtgcacagggga gggtgccgtc tttgtggaag ccatggcgga caacgacctt 120 tcagtattacaagatttcaa tgagtacgat ccatcatttc agcagctagt tttttatctt 180 ccagaggatgtcaatattga ggacctccat cttctaactg ttcaggtaac tcgttttaca 240 tgtgggggatttgttgtggg cacaagattc caccatagtg tgtctgatgg aaaaggaatc 300 ggccagttacttaaaggcat gggagaaatg gcaagggggg agtttaagcc ctccttagaa 360 ccaatatggaatagagaaat ggtgaagcct gaagacatta tgtacctcca gtttgatcac 420 tttgatttcatacacccacc tcttaatctt gagaagtcta ttcaagcatc tatggtaata 480 agcttggagagaataaatta tatcaaacga tgcatgatgg aagaatgcaa agaatttttt 540 tctgcatttgaagttgtagt agcattgatt tggctagcaa ggacaaagtc ttttcgaatt 600 ccacccaatgagtatgtgaa aattatcttt ccaatcgaca tgaggaattc atttgactcc 660 cctcttccaaagggatacta tggtaatgct attggtaatg catgtgcaat ggataatgtc 720 aaagacctcttaaatggatc tcttttatat gctctaatgc ttataaagaa atcaaagttt 780 gctttaaatgagaatttcaa atcaagaatc ttgacaaaac catctgcatt agatgcgaat 840 atgaagcatgaaaatgtagt cggatgtggc gattggagga atttgggatt ttatgaagca 900 gatttcggctggggcaaacc 920 4 306 PRT Taxus cuspidata 4 Met Leu Val Tyr Tyr Pro ProPhe Ala Gly Arg Leu Arg Asn Thr Glu 1 5 10 15 Asn Gly Glu Leu Glu ValGlu Cys Thr Gly Glu Gly Ala Val Phe Val 20 25 30 Glu Ala Met Ala Asp AsnAsp Leu Ser Val Leu Gln Asp Phe Asn Glu 35 40 45 Tyr Asp Pro Ser Phe GlnGln Leu Val Phe Tyr Leu Pro Glu Asp Val 50 55 60 Asn Ile Glu Asp Leu HisLeu Leu Thr Val Gln 
Val Thr Arg Phe Thr 65 70 75 80 Cys Gly Gly Phe ValVal Gly Thr Arg Phe His His Ser Val Ser Asp 85 90 95 Gly Lys Gly Ile GlyGln Leu Leu Lys Gly Met Gly Glu Met Ala Arg 100 105 110 Gly Glu Phe LysPro Ser Leu Glu Pro Ile Trp Asn Arg Glu Met Val 115 120 125 Lys Pro GluAsp Ile Met Tyr Leu Gln Phe Asp His Phe Asp Phe Ile 130 135 140 His ProPro Leu Asn Leu Glu Lys Ser Ile Gln Ala Ser Met Val Ile 145 150 155 160Ser Leu Glu Arg Ile Asn Tyr Ile Lys Arg Cys Met Met Glu Glu Cys 165 170175 Lys Glu Phe Phe Ser Ala Phe Glu Val Val Val Ala Leu Ile Trp Leu 180185 190 Ala Arg Thr Lys Ser Phe Arg Ile Pro Pro Asn Glu Tyr Val Lys Ile195 200 205 Ile Phe Pro Ile Asp Met Arg Asn Ser Phe Asp Ser Pro Leu ProLys 210 215 220 Gly Tyr Tyr Gly Asn Ala Ile Gly Asn Ala Cys Ala Met AspAsn Val 225 230 235 240 Lys Asp Leu Leu Asn Gly Ser Leu Leu Tyr Ala LeuMet Leu Ile Lys 245 250 255 Lys Ser Lys Phe Ala Leu Asn Glu Asn Phe LysSer Arg Ile Leu Thr 260 265 270 Lys Pro Ser Ala Leu Asp Ala Asn Met LysHis Glu Asn Val Val Gly 275 280 285 Cys Gly Asp Trp Arg Asn Leu Gly PheTyr Glu Ala Asp Phe Gly Trp 290 295 300 Gly Lys 305 5 903 DNA Taxuscuspidata 5 ttttatccgt ttgcggggcg gctcagaaat aaagaaaatg gggaacttgaagtggagtgc 60 acagggcagg gtgttctgtt tctggaagcc atggccgaca gcgacctttcagtcttaaca 120 gatctggatg actacaagcc atcgtttcag cagttgattt tttctctaccacaggataca 180 gatattgagg atctccatct cttgattgtt caggtaactc gttttacatgtgggggtttt 240 gttgtgggag cgaatgtgta tagtagtgta tgtgatgcaa aaggatttggccaatttctt 300 caaggtatgg cagagatggc gagaggagag gttaagccct cgattgaaccgatatggaat 360 agagaactgg tgaagccaga acattgtatg cccttccgga tgagtcatcttcaaattata 420 cacgcacctc tgatcgagga gaaatttgtt caaacatctc ttgttataaactttgagata 480 ataaatcata tcagacaacg gatcatggaa gaatgtaaag aaagtttctcttcatttgaa 540 attgtagcag cattggtttg gctagcaaag ataaaggctt ttcaaattccacatagtgag 600 aatgtgaagc ttctttttgc aatggactta aggagatcat ttaatccccctcttccacat 660 ggatactatg gcaatgcctt cggtattgca tgtgcaatgg ataatgtccatgacctttta 720 agtggatctc ttttgcgcgc tataatgatc ataaagaaat 
caaagttctctttacacaaa 780 gaactcaact caaaaaccgt gatgagcccg tctgtagtag atgtcaatacgaagttcgaa 840 gatgtagttt caattagtga ctggaggcag tctatatatt atgaagtggactttggttgg 900 ggc 903 6 301 PRT Taxus cuspidata 6 Phe Tyr Pro Phe AlaGly Arg Leu Arg Asn Lys Glu Asn Gly Glu Leu 1 5 10 15 Glu Val Glu CysThr Gly Gln Gly Val Leu Phe Leu Glu Ala Met Ala 20 25 30 Asp Ser Asp LeuSer Val Leu Thr Asp Leu Asp Asp Tyr Lys Pro Ser 35 40 45 Phe Gln Gln LeuIle Phe Ser Leu Pro Gln Asp Thr Asp Ile Glu Asp 50 55 60 Leu His Leu LeuIle Val Gln Val Thr Arg Phe Thr Cys Gly Gly Phe 65 70 75 80 Val Val GlyAla Asn Val Tyr Ser Ser Val Cys Asp Ala Lys Gly Phe 85 90 95 Gly Gln PheLeu Gln Gly Met Ala Glu Met Ala Arg Gly Glu Val Lys 100 105 110 Pro SerIle Glu Pro Ile Trp Asn Arg Glu Leu Val Lys Pro Glu His 115 120 125 CysMet Pro Phe Arg Met Ser His Leu Gln Ile Ile His Ala Pro Leu 130 135 140Ile Glu Glu Lys Phe Val Gln Thr Ser Leu Val Ile Asn Phe Glu Ile 145 150155 160 Ile Asn His Ile Arg Gln Arg Ile Met Glu Glu Cys Lys Glu Ser Phe165 170 175 Ser Ser Phe Glu Ile Val Ala Ala Leu Val Trp Leu Ala Lys IleLys 180 185 190 Ala Phe Gln Ile Pro His Ser Glu Asn Val Lys Leu Leu PheAla Met 195 200 205 Asp Leu Arg Arg Ser Phe Asn Pro Pro Leu Pro His GlyTyr Tyr Gly 210 215 220 Asn Ala Phe Gly Ile Ala Cys Ala Met Asp Asn ValHis Asp Leu Leu 225 230 235 240 Ser Gly Ser Leu Leu Arg Ala Ile Met IleIle Lys Lys Ser Lys Phe 245 250 255 Ser Leu His Lys Glu Leu Asn Ser LysThr Val Met Ser Pro Ser Val 260 265 270 Val Asp Val Asn Thr Lys Phe GluAsp Val Val Ser Ile Ser Asp Trp 275 280 285 Arg Gln Ser Ile Tyr Tyr GluVal Asp Phe Gly Trp Gly 290 295 300 7 908 DNA Taxus cuspidata 7ttctacccgt ttgcagggcg gctcagaaat aaagaaaatg gggaacttga agtggagtgc 60acagggcagg gtgttctgtt tctggaagcc atggctgaca gcgacgtttc agtcttaaca 120gatctggaag actacaatcc atcgtttcag cagttgcttt tttctctacc acaggataca 180gatattgagg acctccatct cttgattgtt caggtgactc actttacatg tggggatttt 240gttgtgggag cgaatgttta tggtagtgta tgtgacggaa aaggatttgg ccagtttctt 300caaggtatgg cggagatggc 
gagaggagag gttaagccct cgattgaacc gatatggaat 360agagaactgg tgaagccaga agatttaatg gccctccacg tggatcatct tcgaattata 420cacacacctc taatcgagga gaaatttgtt caaacatctc ttgttataaa ctttgagata 480ataaatcata tcagacgatg catcatggaa gaatgtaaag aaagtttctc ttcattcgaa 540attgtagcag cattggtttg gctagcaaag ataaaagctt ttcgaattcc acatagtgag 600aatgtgaaga ttctctttgc aatggacgtg aggagatcat ttaagccccc tcttccaaag 660ggatactatg gcaatgccta tggtattgca tgtgcaatgg ataatgtcca ggatcttcta 720agtggatctc ttttgcatgc tataatgatc ataaagaaat caaagttctc tttacacaaa 780aaaatcaact caaaaactgt gatgagcccg tctccattag acgtcaatat gaagtttgaa 840aatgtagttt caattactga ttggaggcat tctaaatatt atgaagtaga cttcgggtgg 900ggtaaacc 908 8 302 PRT Taxus cuspidata 8 Phe Tyr Pro Phe Ala Gly Arg LeuArg Asn Lys Glu Asn Gly Glu Leu 1 5 10 15 Glu Val Glu Cys Thr Gly GlnGly Val Leu Phe Leu Glu Ala Met Ala 20 25 30 Asp Ser Asp Val Ser Val LeuThr Asp Leu Glu Asp Tyr Asn Pro Ser 35 40 45 Phe Gln Gln Leu Leu Phe SerLeu Pro Gln Asp Thr Asp Ile Glu Asp 50 55 60 Leu His Leu Leu Ile Val GlnVal Thr His Phe Thr Cys Gly Asp Phe 65 70 75 80 Val Val Gly Ala Asn ValTyr Gly Ser Val Cys Asp Gly Lys Gly Phe 85 90 95 Gly Gln Phe Leu Gln GlyMet Ala Glu Met Ala Arg Gly Glu Val Lys 100 105 110 Pro Ser Ile Glu ProIle Trp Asn Arg Glu Leu Val Lys Pro Glu Asp 115 120 125 Leu Met Ala LeuHis Val Asp His Leu Arg Ile Ile His Thr Pro Leu 130 135 140 Ile Glu GluLys Phe Val Gln Thr Ser Leu Val Ile Asn Phe Glu Ile 145 150 155 160 IleAsn His Ile Arg Arg Cys Ile Met Glu Glu Cys Lys Glu Ser Phe 165 170 175Ser Ser Phe Glu Ile Val Ala Ala Leu Val Trp Leu Ala Lys Ile Lys 180 185190 Ala Phe Arg Ile Pro His Ser Glu Asn Val Lys Ile Leu Phe Ala Met 195200 205 Asp Val Arg Arg Ser Phe Lys Pro Pro Leu Pro Lys Gly Tyr Tyr Gly210 215 220 Asn Ala Tyr Gly Ile Ala Cys Ala Met Asp Asn Val Gln Asp LeuLeu 225 230 235 240 Ser Gly Ser Leu Leu His Ala Ile Met Ile Ile Lys LysSer Lys Phe 245 250 255 Ser Leu His Lys Lys Ile Asn Ser Lys Thr Val MetSer Pro Ser Pro 260 265 270 Leu Asp Val Asn Met 
Lys Phe Glu Asn Val ValSer Ile Thr Asp Trp 275 280 285 Arg His Ser Lys Tyr Tyr Glu Val Asp PheGly Trp Gly Lys 290 295 300 9 1297 DNA Taxus cuspidata 9 atgggcaggttcaatgtaga tatgattgag cgagtgatcg ggcgccatgc cttcaatcgc 60 ccaaaaatatcctgcacctc tcccccatta caacaaaact agaggactaa ccaacatatt 120 atcagtctacaatgcctcca gagagtttct gtttctgcag atcctgcaaa aacaattcga 180 gaggctcctccaaggtgctg gtttattatc ccccttttgc tggaaggctg agaaaccaga 240 aaatggggatcttgaagtgg agtgcacagg ggagggtgcc gtcttgtgga agccatggcg 300 gacaacgacctttcagtatt acaagatttc aatggtacga tccatcattt cagcagctag 360 tttttaatcttcgagaggat gtcatattga ggacctccat cttctaactg ttcaggtaac 420 tcgttttacatgggaggatt tgttgtgggc acaagattcc accatagtgt atctgatgga 480 aaggaatcggccagttactt aaaggcatgg gagagatggc aaggggggag ttaagccctc 540 gttagaaccaatatggaata gagaaatggt gaagcctgag acattatgta cctccagttt 600 gatcactttgatttcataca cccacctcta atcttgagaa gtctattcaa gcatctatgg 660 taataagctttgagagataa attatatcaa acgatgcatg atggaagaat gcaaagaatt 720 tttttcgcatttgaagttgt agtagcattg atttggctgg caaggacaaa gtctttcgaa 780 ttccacccaatgagtatgtg aaaattatct ttccaatcga catgggaatt catttgactc 840 ccctcttccaaagggatact atggtaatgc tatggtaatg catgtgcaat ggataatgtc 900 aaagacctcttaaatggatc tctttatatg ctctaatgct tataaagaaa tcaaagtttg 960 ctttaaatgagatttcaaat caagaatctt gacaaaacca tctacattag atgcgaatat 1020 aagcatgaaaatgtagtcgg atgtggcgat tggaggaatt tgggattttt gaagcagatt 1080 ttggatggggaaatgcagtg aatgtaagcc ccatgcagaa caaagagagc atgaattagc 1140 tatgcaaaattattttcttt ttctccgtca gctaagaaca tgattgatgg aatcaagata 1200 ctaatgttcatgcctgatca atggtgaaac cattcaaaat tgaaatggaa gtcacaataa 1260 acaaaatgtggctaaaatat gtaactctaa gttataa 1297 10 302 PRT Taxus cuspidata 10 Phe TyrPro Phe Ala Gly Arg Leu Arg Lys Lys Glu Asp Gly Asp Ile 1 5 10 15 GluVal Val Cys Thr Glu Gln Gly Ala Leu Phe Val Glu Ala Val Ala 20 25 30 AspAsn Asp Leu Ser Ala Val Arg Asp Leu Asp Glu Tyr Asn Pro Leu 35 40 45 PheArg Gln Leu Gln Ser Thr Leu Pro Leu Asp Thr Asp Cys Lys Asp 50 55 60 LeuHis Leu Met Thr Val Gln Val 
Thr Arg Phe Thr Cys Gly Gly Phe 65 70 75 80Val Met Gly Thr Ser Val His Gln Ser Ile Cys Asp Gly Asn Gly Leu 85 90 95Gly Gln Phe Phe Lys Ser Met Ala Glu Met Val Arg Gly Glu Val Lys 100 105110 Pro Ser Ile Glu Pro Val Trp Asn Arg Glu Leu Val Lys Pro Glu Asp 115120 125 Tyr Ile His Leu Gln Leu Tyr Ile Gly Glu Phe Ile Arg Pro Pro Leu130 135 140 Ala Phe Glu Lys Val Gly Gln Thr Ser Leu Ile Ile Ser Phe GluLys 145 150 155 160 Ile Asn His Ile Lys Arg Cys Ile Met Glu Glu Ser LysGlu Ser Phe 165 170 175 Ser Ser Phe Glu Ile Val Thr Ala Leu Val Trp LeuAla Arg Thr Arg 180 185 190 Ala Phe Gln Ile Pro His Asn Glu Asp Val ThrLeu Leu Leu Ala Met 195 200 205 Asp Ala Arg Arg Ser Phe Asp Pro Pro IlePro Lys Gly Tyr Tyr Gly 210 215 220 Asn Val Ile Gly Thr Ala Cys Ala ThrAsn Asn Val His Asn Leu Leu 225 230 235 240 Ser Gly Ser Leu Leu His AlaLeu Thr Ile Ile Lys Lys Ser Met Ser 245 250 255 Ser Phe Tyr Glu Asn IleThr Ser Arg Val Leu Val Asn Pro Ser Thr 260 265 270 Leu Asp Leu Ser MetLys Tyr Glu Asn Val Val Thr Ile Ser Asp Trp 275 280 285 Arg Arg Leu GlyTyr Asn Glu Val Asp Phe Gly Trp Gly Lys 290 295 300 11 911 DNA Taxuscuspidata 11 ttctatccgt tcgcggggcg tctcaggaaa aaagaaaatg gagatcttgaagtggagtgc 60 acaggggagg gtgctctgtt tgtggaagcc atggctgaca ctgacctctcagtcttagga 120 gatttggatg actacagtcc ttcacttgag caactacttt tttgtcttccgcctgataca 180 gatattgagg acatccatcc tctggtggtt caggtaactc gttttacatgtggaggtttt 240 gttgtagggg tgagtttctg ccatggtata tgtgatggac taggagcaggccagtttctt 300 atagccatgg gagagatggc aaggggagag attaagccct cctcggagccaatatggaag 360 agagaattgc tgaagccgga agacccttta taccggttcc agtattatcactttcaattg 420 atttgcccgc cttcaacatt cgggaaaata gttcaaggat ctcttgttataacctctgag 480 acaataaatt gtatcaaaca atgccttagg gaagaaagta aagaattttgctctgcgttc 540 gaagttgtat ctgcattggc ttggatagca aggacaaggg ctcttcaaattccacatagt 600 gagaatgtga agcttatttt tgcaatggac atgagaaaat tatttaatccaccactttcg 660 aagggatact acggtaattt tgttggtacc gtatgtgcaa tggataatgtcaaggaccta 720 ttaagtggat ctcttttgcg tgttgtaagg attataaaga 
aagcaaaggtctctttaaat 780 gagcatttca cgtcaacaat cgtgacaccc cgttctggat cagatgagagtatcaattat 840 gaaaacatag ttggatttgg tgatcgaagg cgattgggat ttgatgaagtagactttggc 900 tggggcaaac c 911 12 303 PRT Taxus cuspidata 12 Phe TyrPro Phe Ala Gly Arg Leu Arg Lys Lys Glu Asn Gly Asp Leu 1 5 10 15 GluVal Glu Cys Thr Gly Glu Gly Ala Leu Phe Val Glu Ala Met Ala 20 25 30 AspThr Asp Leu Ser Val Leu Gly Asp Leu Asp Asp Tyr Ser Pro Ser 35 40 45 LeuGlu Gln Leu Leu Phe Cys Leu Pro Pro Asp Thr Asp Ile Glu Asp 50 55 60 IleHis Pro Leu Val Val Gln Val Thr Arg Phe Thr Cys Gly Gly Phe 65 70 75 80Val Val Gly Val Ser Phe Cys His Gly Ile Cys Asp Gly Leu Gly Ala 85 90 95Gly Gln Phe Leu Ile Ala Met Gly Glu Met Ala Arg Gly Glu Ile Lys 100 105110 Pro Ser Ser Glu Pro Ile Trp Lys Arg Glu Leu Leu Lys Pro Glu Asp 115120 125 Pro Leu Tyr Arg Phe Gln Tyr Tyr His Phe Gln Leu Ile Cys Pro Pro130 135 140 Ser Thr Phe Gly Lys Ile Val Gln Gly Ser Leu Val Ile Thr SerGlu 145 150 155 160 Thr Ile Asn Cys Ile Lys Gln Cys Leu Arg Glu Glu SerLys Glu Phe 165 170 175 Cys Ser Ala Phe Glu Val Val Ser Ala Leu Ala TrpIle Ala Arg Thr 180 185 190 Arg Ala Leu Gln Ile Pro His Ser Glu Asn ValLys Leu Ile Phe Ala 195 200 205 Met Asp Met Arg Lys Leu Phe Asn Pro ProLeu Ser Lys Gly Tyr Tyr 210 215 220 Gly Asn Phe Val Gly Thr Val Cys AlaMet Asp Asn Val Lys Asp Leu 225 230 235 240 Leu Ser Gly Ser Leu Leu ArgVal Val Arg Ile Ile Lys Lys Ala Lys 245 250 255 Val Ser Leu Asn Glu HisPhe Thr Ser Thr Ile Val Thr Pro Arg Ser 260 265 270 Gly Ser Asp Glu SerIle Asn Tyr Glu Asn Ile Val Gly Phe Gly Asp 275 280 285 Arg Arg Arg LeuGly Phe Asp Glu Val Asp Phe Gly Trp Gly Lys 290 295 300 13 968 DNA Taxuscuspidata 13 ttttatccgt ttgcaggccg gctcagaaat aaagaaaatg gggaacttgaagtggagtgc 60 acagggcagg gtgttctgtt tctggaagcc atggctgaca gcgacctttcagtcttaaca 120 gatctcgata actacaatcc atcgtttcag cagttgattt tttctctaccacaggataca 180 gatattgagg acctccatct cttgattgtt caggtaactc gttttacatgtgggggtttt 240 gttgtgggag cgaatgtgta tggtagtaca tgcgatgcaa aaggatttggccagtttctt 300 
caaggtatgg cagagatggc gagaggagag gttaagccct cgattgaaccgatatggaat 360 aagagaactg gtgaagctag aagagaggtt aagccctcga ttgaaccgatatggaataag 420 agaactggtg aagctagaag attgtatgcc ctttccggga tgagtcatcttcaaattata 480 cacgcacctg taattgagga gaaatttgtt caaacatctc ttgttataaactttgagata 540 ataaatcata tcagacgacg catcatggaa gaatgcaaag aaagtttatcttcatttgaa 600 attgtagcag cattggtttg gctagcaaag ataaaggctt ttcaaattccacatagtgag 660 aatgtgaagc ttctttttgc aatggacttg aggagatcat ttaatccccctcttccacat 720 ggatactatg gcaatgcctt tggtattgca tgtgcaatgg ataatgtccatgaccttcta 780 agtggatctc ttttgcgcac tataatgatc ataaagaaat caaagttctctttacacaaa 840 gaactcaact caaaaaccgt gatgagctcg tctgtagtag atgtcaatacgaagtttgaa 900 gatgtagttt caattagtga ttggaggcat tctatatatt atgaagtggactttggctgg 960 ggtaaacc 968 14 322 PRT Taxus cuspidata 14 Phe Tyr ProPhe Ala Gly Arg Leu Arg Asn Lys Glu Asn Gly Glu Leu 1 5 10 15 Glu ValGlu Cys Thr Gly Gln Gly Val Leu Phe Leu Glu Ala Met Ala 20 25 30 Asp SerAsp Leu Ser Val Leu Thr Asp Leu Asp Asn Tyr Asn Pro Ser 35 40 45 Phe GlnGln Leu Ile Phe Ser Leu Pro Gln Asp Thr Asp Ile Glu Asp 50 55 60 Leu HisLeu Leu Ile Val Gln Val Thr Arg Phe Thr Cys Gly Gly Phe 65 70 75 80 ValVal Gly Ala Asn Val Tyr Gly Ser Thr Cys Asp Ala Lys Gly Phe 85 90 95 GlyGln Phe Leu Gln Gly Met Ala Glu Met Ala Arg Gly Glu Val Lys 100 105 110Pro Ser Ile Glu Pro Ile Trp Asn Lys Arg Thr Gly Glu Ala Arg Arg 115 120125 Glu Val Lys Pro Ser Ile Glu Pro Ile Trp Asn Lys Arg Thr Gly Glu 130135 140 Ala Arg Arg Leu Tyr Ala Leu Ser Gly Met Ser His Leu Gln Ile Ile145 150 155 160 His Ala Pro Val Ile Glu Glu Lys Phe Val Gln Thr Ser LeuVal Ile 165 170 175 Asn Phe Glu Ile Ile Asn His Ile Arg Arg Arg Ile MetGlu Glu Cys 180 185 190 Lys Glu Ser Leu Ser Ser Phe Glu Ile Val Ala AlaLeu Val Trp Leu 195 200 205 Ala Lys Ile Lys Ala Phe Gln Ile Pro His SerGlu Asn Val Lys Leu 210 215 220 Leu Phe Ala Met Asp Leu Arg Arg Ser PheAsn Pro Pro Leu Pro His 225 230 235 240 Gly Tyr Tyr Gly Asn Ala Phe GlyIle Ala Cys Ala Met Asp Asn Val 245 250 255 His 
Asp Leu Leu Ser Gly SerLeu Leu Arg Thr Ile Met Ile Ile Lys 260 265 270 Lys Ser Lys Phe Ser LeuHis Lys Glu Leu Asn Ser Lys Thr Val Met 275 280 285 Ser Ser Ser Val ValAsp Val Asn Thr Lys Phe Glu Asp Val Val Ser 290 295 300 Ile Ser Asp TrpArg His Ser Ile Tyr Tyr Glu Val Asp Phe Gly Trp 305 310 315 320 Gly Lys15 908 DNA Taxus cuspidata 15 ttttacccgt ttgcggggcg tctcagaaataaagaaaatg gggatctgga agtggagtgt 60 acaggggagg gtgctgtgtt tgtggaagccatggcggaca cagatctttc ttccttggga 120 gatttggatg ctcataatcc ttcatttcaccagctttctg tttcacctcc agtggattct 180 gatattgagg gcctccatct tgcagctcttcaggtaactc gttttacatg tgggggtttt 240 gttctaggag taagtttgaa ccaaagtgtgtgcgatggaa aaggattggg aaattttctt 300 aaaggtgtgg cagagatggt gaggggaaaagataagccct caattgaacc agtatggaat 360 agagaaatgg taaagtttga agactatacacgcctccaat tttatcacca tgaattcata 420 caaccacctt taatagatga gaaaattgttcaaaaatctc ttgttataaa cttggagaca 480 ataaatatta tcaaacgatg tattatggaagaatatacaa aatttttctc tacattcgaa 540 atcgtagcag caatggtttg gctagcaagaacaaaagctt tcaaaattcc acatagtgaa 600 aatgcagagc ttctctttac aatggatatgagggaatcat ttaatccccc tcttccaaag 660 ggatactatg gtaatgttat gggtatagtatgtgcattgg ataatgtcaa acacctatta 720 agtggatcta ttttgcgtgc tgcaatggttatacagaaat caaggttttt ctttacagag 780 aatttccggt taagatctat gacacaaccatctgcattga ctgtgaagat caagcacaaa 840 aatgtagttg catgtagtga ttggaggcaatatggatatg atgaagtgga cttcggctgg 900 ggtaaacc 908 16 302 PRT Taxuscuspidata 16 Phe Tyr Pro Phe Ala Gly Arg Leu Arg Asn Lys Glu Asn Gly AspLeu 1 5 10 15 Glu Val Glu Cys Thr Gly Glu Gly Ala Val Phe Val Glu AlaMet Ala 20 25 30 Asp Thr Asp Leu Ser Ser Leu Gly Asp Leu Asp Ala His AsnPro Ser 35 40 45 Phe His Gln Leu Ser Val Ser Pro Pro Val Asp Ser Asp IleGlu Gly 50 55 60 Leu His Leu Ala Ala Leu Gln Val Thr Arg Phe Thr Cys GlyGly Phe 65 70 75 80 Val Leu Gly Val Ser Leu Asn Gln Ser Val Cys Asp GlyLys Gly Leu 85 90 95 Gly Asn Phe Leu Lys Gly Val Ala Glu Met Val Arg GlyLys Asp Lys 100 105 110 Pro Ser Ile Glu Pro Val Trp Asn Arg Glu Met ValLys Phe Glu Asp 115 120 125 
Tyr Thr Arg Leu Gln Phe Tyr His His Glu PheIle Gln Pro Pro Leu 130 135 140 Ile Asp Glu Lys Ile Val Gln Lys Ser LeuVal Ile Asn Leu Glu Thr 145 150 155 160 Ile Asn Ile Ile Lys Arg Cys IleMet Glu Glu Tyr Thr Lys Phe Phe 165 170 175 Ser Thr Phe Glu Ile Val AlaAla Met Val Trp Leu Ala Arg Thr Lys 180 185 190 Ala Phe Lys Ile Pro HisSer Glu Asn Ala Glu Leu Leu Phe Thr Met 195 200 205 Asp Met Arg Glu SerPhe Asn Pro Pro Leu Pro Lys Gly Tyr Tyr Gly 210 215 220 Asn Val Met GlyIle Val Cys Ala Leu Asp Asn Val Lys His Leu Leu 225 230 235 240 Ser GlySer Ile Leu Arg Ala Ala Met Val Ile Gln Lys Ser Arg Phe 245 250 255 PhePhe Thr Glu Asn Phe Arg Leu Arg Ser Met Thr Gln Pro Ser Ala 260 265 270Leu Thr Val Lys Ile Lys His Lys Asn Val Val Ala Cys Ser Asp Trp 275 280285 Arg Gln Tyr Gly Tyr Asp Glu Val Asp Phe Gly Trp Gly Lys 290 295 30017 908 DNA Taxus cuspidata 17 ttctacccgt ttgcggggcg gatgagaaacaaaggagatg gggaactgga agtggattgc 60 acgggggaag gtgctctgtt tgtagaagccatggcggacg acaacctttc agtgttggga 120 ggttttgatt accacaatcc agcatttgggaagctacttt actcactacc actggatacc 180 cctattcacg acctccatcc tctggttgttcaggtaactc gttttacctg cggggggttt 240 gttgtgggat taagtttgga ccatactatatgtgatggac gtggtgcagg tcaatttctt 300 aaagccctag cagaratggc gaggggagaggctaagccct cattggaacc aatatggaat 360 agagagttgt tgaagcccga agaccttatacgcctgcaat tttatcactt tgaatcgatg 420 cgtccacctc caatagttga agaaatggttcaatcatcta ttattataaa tgctgagaca 480 ataagtaata tsaaacaata cattatggaagaatgtaaag aatcttgttc tgcatttgat 540 gtcgtaggag gattggcttg gctagccaggacaaaggctt ttcaaattcc acatacagag 600 aatgtgatgg ttatttttgc agtggatgcgaggagatcat ttgatccacc acttccaaag 660 ggttactatg gtaatgtcgt tggtaatgcatgtgcattgg ataatgttca agacctctta 720 aatggatctc ttttgcgtgc tacaatgattataaagaaat caaaggtatc tttaaaagag 780 aatataaggg caaaaacttt gacgataccatctatagtag atgtgaatgt gaaacatgaa 840 aacatagttg gattaggcga tttgagacgactgggattta atgaagtgga cttcggctgg 900 ggsaagcc 908 18 302 PRT Taxuscuspidata VARIANT 164 Any amino acid 18 Phe Tyr Pro Phe Ala Gly Arg MetArg Asn Lys 
Gly Asp Gly Glu Leu 1 5 10 15 Glu Val Asp Cys Thr Gly GluGly Ala Leu Phe Val Glu Ala Met Ala 20 25 30 Asp Asp Asn Leu Ser Val LeuGly Gly Phe Asp Tyr His Asn Pro Ala 35 40 45 Phe Gly Lys Leu Leu Tyr SerLeu Pro Leu Asp Thr Pro Ile His Asp 50 55 60 Leu His Pro Leu Val Val GlnVal Thr Arg Phe Thr Cys Gly Gly Phe 65 70 75 80 Val Val Gly Leu Ser LeuAsp His Thr Ile Cys Asp Gly Arg Gly Ala 85 90 95 Gly Gln Phe Leu Lys AlaLeu Ala Glu Met Ala Arg Gly Glu Ala Lys 100 105 110 Pro Ser Leu Glu ProIle Met Asn Arg Glu Leu Leu Lys Pro Glu Asp 115 120 125 Leu Ile Arg LeuGln Phe Tyr His Phe Glu Ser Met Arg Pro Pro Pro 130 135 140 Ile Val GluGlu Met Val Gln Ser Ser Ile Ile Ile Asn Ala Glu Thr 145 150 155 160 IleSer Asn Xaa Lys Gln Tyr Ile Met Glu Glu Cys Lys Glu Ser Cys 165 170 175Ser Ala Phe Asp Val Val Gly Gly Leu Ala Met Leu Ala Arg Thr Lys 180 185190 Ala Phe Gln Ile Pro His Thr Glu Asn Val Met Val Ile Phe Ala Val 195200 205 Asp Ala Arg Arg Ser Phe Asp Pro Pro Leu Pro Lys Gly Tyr Tyr Gly210 215 220 Asn Val Val Gly Asn Ala Cys Ala Leu Asp Asn Val Gln Asp LeuLeu 225 230 235 240 Asn Gly Ser Leu Leu Arg Ala Thr Met Ile Ile Lys LysSer Lys Val 245 250 255 Ser Leu Lys Glu Asn Ile Arg Ala Lys Thr Leu ThrIle Pro Ser Ile 260 265 270 Val Asp Val Asn Val Lys His Glu Asn Ile ValGly Leu Gly Asp Leu 275 280 285 Arg Arg Leu Gly Phe Asn Glu Val Asp PheGly Trp Gly Lys 290 295 300 19 911 DNA Taxus cuspidata 19 tactacccgctggcaggacg gctcagaagt aaagaaattg gggaacttga agtggagtgc 60 acaggggatggtgctctgtt tgtggaagcc atggtggaag acaccatttc agtcttacga 120 gatctggatgacctcaatcc atcatttcag cagttagttt tttggcatcc attggacact 180 gctattgaggatcttcatct tgtgattgtt caggtaacac gttttacatg tgggggcatt 240 gccgttggagtgactttgcc ccatagtgta tgtgatggac gtggagcacc ccagtttgtt 300 acagcactggcagaaatggc gaggggagag gttaagccct tattagaacc aatatggaat 360 agagaattgttgaaccctga agaccctcta catctccagt taaatcaatt tgattcgata 420 tgcccacctccaatgctcga ggaattgggt caagcttctt ttgttataaa tgttgacacc 480 atagaatatatgaaacaatg tgttatggag gaatgtaatg 
atttttgttc gtcctttgaa 540 gtagtggcagcattggtttg gatagcaagg acaaaggctc ttcaaattcc acatactgag 600 aatgtgaagcttctctttgc gatggatttg aggaaattat ttaatccccc acttccaaat 660 ggatattatggtaatgccat tggtactgca tatgcaatgg ataatgtcca agacctctta 720 aatggatctcttttgcgtgc tataatgatt ataaaaaaag caaaggctga tttaaaagat 780 aattattcgaggtcaagggt agttacaaac ccaaattcat tagatgtgaa caagaaatcc 840 aacaacattcttgcattgag tgactggagg cggttgggat tttatgaagc cgattttggc 900 tggggcaagc c911 20 303 PRT Taxus cuspidata 20 Tyr Tyr Pro Leu Ala Gly Arg Leu ArgSer Lys Glu Ile Gly Glu Leu 1 5 10 15 Glu Val Glu Cys Thr Gly Asp GlyAla Leu Phe Val Glu Ala Met Val 20 25 30 Glu Asp Thr Ile Ser Val Leu ArgAsp Leu Asp Asp Leu Asn Pro Ser 35 40 45 Phe Gln Gln Leu Val Phe Trp HisPro Leu Asp Thr Ala Ile Glu Asp 50 55 60 Leu His Leu Val Ile Val Gln ValThr Arg Phe Thr Cys Gly Gly Ile 65 70 75 80 Ala Val Gly Val Thr Leu ProHis Ser Val Cys Asp Gly Arg Gly Ala 85 90 95 Pro Gln Phe Val Thr Ala LeuAla Glu Met Ala Arg Gly Glu Val Lys 100 105 110 Pro Leu Leu Glu Pro IleTrp Asn Arg Glu Leu Leu Asn Pro Glu Asp 115 120 125 Pro Leu His Leu GlnLeu Asn Gln Phe Asp Ser Ile Cys Pro Pro Pro 130 135 140 Met Leu Glu GluLeu Gly Gln Ala Ser Phe Val Ile Asn Val Asp Thr 145 150 155 160 Ile GluTyr Met Lys Gln Cys Val Met Glu Glu Cys Asn Asp Phe Cys 165 170 175 SerSer Phe Glu Val Val Ala Ala Leu Val Trp Ile Ala Arg Thr Lys 180 185 190Ala Leu Gln Ile Pro His Thr Glu Asn Val Lys Leu Leu Phe Ala Met 195 200205 Asp Leu Arg Lys Leu Phe Asn Pro Pro Leu Pro Asn Gly Tyr Tyr Gly 210215 220 Asn Ala Ile Gly Thr Ala Tyr Ala Met Asp Asn Val Gln Asp Leu Leu225 230 235 240 Asn Gly Ser Leu Leu Arg Ala Ile Met Ile Ile Lys Lys AlaLys Ala 245 250 255 Asp Leu Lys Asp Asn Tyr Ser Arg Ser Arg Val Val ThrAsn Pro Asn 260 265 270 Ser Leu Asp Val Asn Lys Lys Ser Asn Asn Ile LeuAla Leu Ser Asp 275 280 285 Trp Arg Arg Leu Gly Phe Tyr Glu Ala Asp PheGly Trp Gly Lys 290 295 300 21 911 DNA Taxus cuspidata 21 tactacccgctggcaggacg gctcagaagt aaagaaattg gggaacttga agtggagtgc 60 
acaggggatggtgctctgtt tgtggaagcc atggtggaag acaccatttc agtcttacga 120 gatctggatgacctcaatcc atcatttcag cagttagttt tttggcatcc attggacact 180 gctattgaggatcttcatct tgtgattgtt caggtaacac gttttacatg tgggggcatt 240 gccgttggagtgactttgcc ccatagtgta tgtgatggac gtggagcacc ccagtttgtt 300 acagcactggcagaaatggc gaggggagag gttaagccct tattagaacc aatatggaat 360 agagaattgttgaaccctga agaccctcta catctccagt taaatcaatt tgattcgata 420 tgcccacctccaatgctcga ggaattgggt caagcttctt ttgttataaa tgttgacacc 480 atagaatatatgaaacaatg tgttatggag gaatgtaatg atttttgttc gtcctttgaa 540 gtagtggcagcattggtttg gatagcaagg acaaaggctc ttcaaattcc acatactgag 600 aatgtgaagcttctctttgc gatggatttg aggaaattat ttaatccccc acttccaaat 660 ggatattatggtaatgccat tggtactgca tatgcaatgg ataatgtcca agacctctta 720 aatggatctcttttgcgtgc tataatgatt ataaaaaaag caaaggctga tttaaaagat 780 aattattcgaggtcaagggt agttacaaac ccaaattcat tagatgtgaa caagaaatcc 840 aacaacattcttgcattgag tgactggagg cggttgggat tttatgaagc cgattttggc 900 tggggcaagc c911 22 306 PRT Taxus cuspidata 22 Tyr Tyr Pro Leu Ala Gly Arg Leu GluThr Cys Asp Gly Met Val Tyr 1 5 10 15 Ile Asp Cys Asn Asp Lys Gly AlaGlu Phe Ile Glu Ala Tyr Ala Ser 20 25 30 Pro Glu Leu Gly Val Ala Glu IleMet Ala Asp Ser Phe Pro His Gln 35 40 45 Ile Phe Ala Phe Asn Gly Val LeuAsn Ile Asp Gly His Phe Met Pro 50 55 60 Leu Leu Ala Val Gln Ala Thr LysLeu Lys Asp Gly Ile Ala Leu Ala 65 70 75 80 Ile Thr Val Asn His Ala ValAla Asp Ala Thr Ser Val Trp His Phe 85 90 95 Ile Ser Ser Trp Ala Gln LeuCys Lys Glu Pro Ser Asn Ile Pro Leu 100 105 110 Leu Pro Leu His Thr ArgCys Phe Thr Thr Ile Ser Pro Ile Lys Leu 115 120 125 Asp Ile Gln Tyr SerSer Thr Thr Thr Glu Ser Ile Asp Asn Phe Phe 130 135 140 Pro Pro Pro LeuThr Glu Lys Ile Phe His Phe Ser Gly Lys Thr Ile 145 150 155 160 Ser ArgLeu Lys Glu Glu Ala Met Glu Ala Cys Lys Asp Lys Ser Ile 165 170 175 SerIle Ser Ser Phe Gln Ala Leu Cys Gly His Leu Trp Gln Ser Ile 180 185 190Thr Arg Ala Arg Gly Leu Ser Pro Ser Glu Pro Thr Thr Ile Lys Ile 195 200205 Ala Val Asn Cys Arg Pro 
Arg Ile Val Pro Pro Leu Pro Asn Ser Tyr 210215 220 Phe Gly Asn Ala Val Gln Val Val Asp Val Thr Met Thr Thr Glu Glu225 230 235 240 Leu Leu Gly Asn Gly Gly Ala Cys Ala Ala Leu Ile Leu HisGln Lys 245 250 255 Ile Ser Ala His Gln Asp Thr Gln Ile Arg Ala Glu LeuAsp Lys Pro 260 265 270 Pro Lys Ile Val His Thr Asn Asn Leu Ile Pro CysAsn Ile Ile Ala 275 280 285 Met Ala Gly Ser Pro Arg Phe Pro Ile Tyr AsnAsn Asp Phe Gly Trp 290 295 300 Gly Lys 305 23 908 DNA Taxus cuspidata23 ttctacccgt tcgcggggcg gatcagacag aaagaaaatg aggaactgga agtggagtgc 60acaggggagg gtgcactgtt tgtggaagcc gtggtggaca atgatctttc agtcttgaaa 120gatttggatg cccaaaatgc atcttatgag cagttgctct tttcgcttcc gcccaataca 180caggttcagg acctccatcc tctgattctt caggtaactc gttttaaatg tggaggtttt 240gttgtgggag ttggtttcca ccatagtata tgtgacgcac gaggaggaac tcaatttctt 300ctaggcctag cagatatggc aaggggagag actaagcctt tagtggaacc agtatggaat 360agagaactga taaaccctga agatctaatg cacctccaat ttcataagtt tggtttgata 420cgccaacctc taaaacttga tgaaatttgt caagcatctt ttactataaa ctcaaagata 480ataaattaca tcaaacaatg tgttatagaa gaatgtaatg aaattttctc tgcatttgaa 540gttgtagtag cattaacttg gatagcaagg acaaaggctt ttcaaattcc acatagtgag 600aatgtgatga tgctctttgg aatggacgcg aggaaatatt ttaatccccc acttccaaag 660ggatattatg gtaatgccat tggtacttca tgtgtaattg aaaatgtaca agacctctta 720aatggatctc tttcgcgtgc tgtaatgatc acaaagaaat caaaggtccc tttaattgag 780aatttaaggt caagaattgt ggcgaaccaa tctggagtag atgaggaaat taagcatgaa 840aacgtagttg gatttggaga ttggaggcga ttgggatttc atgaagtgga cttcggctgg 900ggcaagcc 908 24 302 PRT Taxus cuspidata 24 Phe Tyr Pro Phe Ala Gly ArgIle Arg Gln Lys Glu Asn Glu Glu Leu 1 5 10 15 Glu Val Glu Cys Thr GlyGlu Gly Ala Leu Phe Val Glu Ala Val Val 20 25 30 Asp Asn Asp Leu Ser ValLeu Lys Asp Leu Asp Ala Gln Asn Ala Ser 35 40 45 Tyr Glu Gln Leu Leu PheSer Leu Pro Pro Asn Thr Gln Val Gln Asp 50 55 60 Leu His Pro Leu Ile LeuGln Val Thr Arg Phe Lys Cys Gly Gly Phe 65 70 75 80 Val Val Gly Val GlyPhe His His Ser Ile Cys Asp Ala Arg Gly Gly 85 90 95 Thr Gln Phe Leu 
LeuGly Leu Ala Asp Met Ala Arg Gly Glu Thr Lys 100 105 110 Pro Leu Val GluPro Val Trp Asn Arg Glu Leu Ile Asn Pro Glu Asp 115 120 125 Leu Met HisLeu Gln Phe His Lys Phe Gly Leu Ile Arg Gln Pro Leu 130 135 140 Lys LeuAsp Glu Ile Cys Gln Ala Ser Phe Thr Ile Asn Ser Lys Ile 145 150 155 160Ile Asn Tyr Ile Lys Gln Cys Val Ile Glu Glu Cys Asn Glu Ile Phe 165 170175 Ser Ala Phe Glu Val Val Val Ala Leu Thr Trp Ile Ala Arg Thr Lys 180185 190 Ala Phe Gln Ile Pro His Ser Glu Asn Val Met Met Leu Phe Gly Met195 200 205 Asp Ala Arg Lys Tyr Phe Asn Pro Pro Leu Pro Lys Gly Tyr TyrGly 210 215 220 Asn Ala Ile Gly Thr Ser Cys Val Ile Glu Asn Val Gln AspLeu Leu 225 230 235 240 Asn Gly Ser Leu Ser Arg Ala Val Met Ile Thr LysLys Ser Lys Val 245 250 255 Pro Leu Ile Glu Asn Leu Arg Ser Arg Ile ValAla Asn Gln Ser Gly 260 265 270 Val Asp Glu Glu Ile Lys His Glu Asn ValVal Gly Phe Gly Asp Trp 275 280 285 Arg Arg Leu Gly Phe His Glu Val AspPhe Gly Trp Gly Lys 290 295 300 25 1320 DNA Taxus cuspidata 25atgggcaggt tcaatgtaga tatgattgag cgagtgatcg tggcgccatg ccttcaatcg 60cccaaaaata tcctgcacct ctcccccatt gacaacaaaa ctagaggact aaccaacata 120ttatcagtct acaatgcctc ccagagagtt tctgtttctg cagatcctgc aaaaacaatt 180cgagaggctc tctccaaggt gctggtttat tatccccctt ttgctggaag gctgagaaac 240acagaaaatg gggatcttga agtggagtgc acaggggagg gtgccgtctt tgtggaagcc 300atggcggaca acgacctttc agtattacaa gatttcaatg agtacgatcc atcatttcag 360cagctagttt ttaatcttcg agaggatgtc aatattgagg acctccatct tctaactgtt 420caggtaactc gttttacatg tggaggattt gttgtgggca caagattcca ccatagtgta 480tctgatggaa aaggaatcgg ccagttactt aaaggcatgg gagagatggc aaggggggag 540tttaagccct cgttagaacc aatatggaat agagaaatgg tgaagcctga agacattatg 600tacctccagt ttgatcactt tgatttcata cacccacctc ttaatcttga gaagtctatt 660caagcatcta tggtaataag ctttgagaga ataaattata tcaaacgatg catgatggaa 720gaatgcaaag aatttttttc tgcatttgaa gttgtagtag cattgatttg gctggcaagg 780acaaagtctt ttcgaattcc acccaatgag tatgtgaaaa ttatctttcc aatcgacatg 840aggaattcat ttgactcccc tcttccaaag ggatactatg 
gtaatgctat tggtaatgca 900tgtgcaatgg ataatgtcaa agacctctta aatggatctc ttttatatgc tctaatgctt 960ataaagaaat caaagtttgc tttaaatgag aatttcaaat caagaatctt gacaaaacca 1020tctacattag atgcgaatat gaagcatgaa aatgtagtcg gatgtggcga ttggaggaat 1080ttgggatttt atgaagcaga ttttggatgg ggaaatgcag tgaatgtaag ccccatgcag 1140caacaaagag agcatgaatt agctatgcaa aattattttc tttttctccg atcagctaag 1200aacatgattg atggaatcaa gatactaatg ttcatgcctg catcaatggt gaaaccattc 1260aaaattgaaa tggaagtcac aataaacaaa tatgtggcta aaatatgtaa ctctaagtta 132026 440 PRT Taxus cuspidata 26 Met Gly Arg Phe Asn Val Asp Met Ile GluArg Val Ile Val Ala Pro 1 5 10 15 Cys Leu Gln Ser Pro Lys Asn Ile LeuHis Leu Ser Pro Ile Asp Asn 20 25 30 Lys Thr Arg Gly Leu Thr Asn Ile LeuSer Val Tyr Asn Ala Ser Gln 35 40 45 Arg Val Ser Val Ser Ala Asp Pro AlaLys Thr Ile Arg Glu Ala Leu 50 55 60 Ser Lys Val Leu Val Tyr Tyr Pro ProPhe Ala Gly Arg Leu Arg Asn 65 70 75 80 Thr Glu Asn Gly Asp Leu Glu ValGlu Cys Thr Gly Glu Gly Ala Val 85 90 95 Phe Val Glu Ala Met Ala Asp AsnAsp Leu Ser Val Leu Gln Asp Phe 100 105 110 Asn Glu Tyr Asp Pro Ser PheGln Gln Leu Val Phe Asn Leu Arg Glu 115 120 125 Asp Val Asn Ile Glu AspLeu His Leu Leu Thr Val Gln Val Thr Arg 130 135 140 Phe Thr Cys Gly GlyPhe Val Val Gly Thr Arg Phe His His Ser Val 145 150 155 160 Ser Asp GlyLys Gly Ile Gly Gln Leu Leu Lys Gly Met Gly Glu Met 165 170 175 Ala ArgGly Glu Phe Lys Pro Ser Leu Glu Pro Ile Trp Asn Arg Glu 180 185 190 MetVal Lys Pro Glu Asp Ile Met Tyr Leu Gln Phe Asp His Phe Asp 195 200 205Phe Ile His Pro Pro Leu Asn Leu Glu Lys Ser Ile Gln Ala Ser Met 210 215220 Val Ile Ser Phe Glu Arg Ile Asn Tyr Ile Lys Arg Cys Met Met Glu 225230 235 240 Glu Cys Lys Glu Phe Phe Ser Ala Phe Glu Val Val Val Ala LeuIle 245 250 255 Trp Leu Ala Arg Thr Lys Ser Phe Arg Ile Pro Pro Asn GluTyr Val 260 265 270 Lys Ile Ile Phe Pro Ile Asp Met Arg Asn Ser Phe AspSer Pro Leu 275 280 285 Pro Lys Gly Tyr Tyr Gly Asn Ala Ile Gly Asn AlaCys Ala Met Asp 290 295 300 Asn Val Lys Asp Leu Leu Asn Gly Ser 
Leu LeuTyr Ala Leu Met Leu 305 310 315 320 Ile Lys Lys Ser Lys Phe Ala Leu AsnGlu Asn Phe Lys Ser Arg Ile 325 330 335 Leu Thr Lys Pro Ser Thr Leu AspAla Asn Met Lys His Glu Asn Val 340 345 350 Val Gly Cys Gly Asp Trp ArgAsn Leu Gly Phe Tyr Glu Ala Asp Phe 355 360 365 Gly Trp Gly Asn Ala ValAsn Val Ser Pro Met Gln Gln Gln Arg Glu 370 375 380 His Glu Leu Ala MetGln Asn Tyr Phe Leu Phe Leu Arg Ser Ala Lys 385 390 395 400 Asn Met IleAsp Gly Ile Lys Ile Leu Met Phe Met Pro Ala Ser Met 405 410 415 Val LysPro Phe Lys Ile Glu Met Glu Val Thr Ile Asn Lys Tyr Val 420 425 430 AlaLys Ile Cys Asn Ser Lys Leu 435 440 27 1317 DNA Taxus cuspidata 27atggagaaga cagatttaca cgtaaatctg attgagaaag tgatggttgg gccatccccg 60cctctgccca aaaccaccct gcaactctcc tccatagaca acctgccagg ggtaagagga 120agcattttca atgccttgtt aatttacaat gcctctccct ctcccaccat gatctctgca 180gatcctgcaa aaccaattag agaagctctc gccaagatcc tggtttatta tccccctttt 240gctgggcgcc tcagagagac agaaaatggg gatctggaag tggaatgcac aggggagggt 300gctatgtttt tggaagccat ggcagacaat gagctgtctg tgttgggaga ttttgatgac 360agcaatccat catttcagca gctacttttt tcgcttccac tcgataccaa tttcaaagac 420ctctctcttc tggttgttca ggtaactcgt tttacatgtg gaggctttgt tgttggagtg 480agtttccacc atggtgtatg tgatggtcga ggagcggccc aatttcttaa aggtttggca 540gagatggcac ggggagaggt taagctctca ttggaaccaa tatggaatag ggaactagtg 600aagcttgatg accctaaata ccttcaattt tttcactttg aattcctacg agcgccttca 660attgttgaga aaattgttca aacatatttt attatagatt ttgagaccat aaattatatc 720aaacaatctg ttatggaaga atgtaaagaa ttttgctctt cattcgaagt tgcatcagca 780atgacttgga tagcaaggac aagagctttt caaattccag aaagtgagta cgtgaaaatt 840ctcttcggaa tggacatgag gaactcattt aatccccctc ttccaagcgg atactatggt 900aactccattg gtaccgcatg tgcagtggat aatgttcaag acctcttaag tggatctctt 960ttgcgtgcta taatgattat aaagaaatca aaggtctctt taaatgataa tttcaagtca 1020agagctgtgg tgaagccatc tgaattggat gtgaatatga atcatgaaaa cgtagttgca 1080tttgctgatt ggagccgatt gggatttgat gaagtggatt ttggttgggg gaatgcggtg 1140agtgtaagcc ctgtgcaaca acagtctgcg ttagcaatgc 
aaaattattt tcttttccta 1200aaaccttcca agaacaagcc cgatggaatc aaaatattaa tgtttctgcc cctatcaaaa 1260atgaagtcat tcaaaattga aatggaagcc atgatgaaaa aatatgtggc taaagta 1317 28439 PRT Taxus cuspidata 28 Met Glu Lys Thr Asp Leu His Val Asn Leu IleGlu Lys Val Met Val 1 5 10 15 Gly Pro Ser Pro Pro Leu Pro Lys Thr ThrLeu Gln Leu Ser Ser Ile 20 25 30 Asp Asn Leu Pro Gly Val Arg Gly Ser IlePhe Asn Ala Leu Leu Ile 35 40 45 Tyr Asn Ala Ser Pro Ser Pro Thr Met IleSer Ala Asp Pro Ala Lys 50 55 60 Pro Ile Arg Glu Ala Leu Ala Lys Ile LeuVal Tyr Tyr Pro Pro Phe 65 70 75 80 Ala Gly Arg Leu Arg Glu Thr Glu AsnGly Asp Leu Glu Val Glu Cys 85 90 95 Thr Gly Glu Gly Ala Met Phe Leu GluAla Met Ala Asp Asn Glu Leu 100 105 110 Ser Val Leu Gly Asp Phe Asp AspSer Asn Pro Ser Phe Gln Gln Leu 115 120 125 Leu Phe Ser Leu Pro Leu AspThr Asn Phe Lys Asp Leu Ser Leu Leu 130 135 140 Val Val Gln Val Thr ArgPhe Thr Cys Gly Gly Phe Val Val Gly Val 145 150 155 160 Ser Phe His HisGly Val Cys Asp Gly Arg Gly Ala Ala Gln Phe Leu 165 170 175 Lys Gly LeuAla Glu Met Ala Arg Gly Glu Val Lys Leu Ser Leu Glu 180 185 190 Pro IleTrp Asn Arg Glu Leu Val Lys Leu Asp Asp Pro Lys Tyr Leu 195 200 205 GlnPhe Phe His Phe Glu Phe Leu Arg Ala Pro Ser Ile Val Glu Lys 210 215 220Ile Val Gln Thr Tyr Phe Ile Ile Asp Phe Glu Thr Ile Asn Tyr Ile 225 230235 240 Lys Gln Ser Val Met Glu Glu Cys Lys Glu Phe Cys Ser Ser Phe Glu245 250 255 Val Ala Ser Ala Met Thr Trp Ile Ala Arg Thr Arg Ala Phe GlnIle 260 265 270 Pro Glu Ser Glu Tyr Val Lys Ile Leu Phe Gly Met Asp MetArg Asn 275 280 285 Ser Phe Asn Pro Pro Leu Pro Ser Gly Tyr Tyr Gly AsnSer Ile Gly 290 295 300 Thr Ala Cys Ala Val Asp Asn Val Gln Asp Leu LeuSer Gly Ser Leu 305 310 315 320 Leu Arg Ala Ile Met Ile Ile Lys Lys SerLys Val Ser Leu Asn Asp 325 330 335 Asn Phe Lys Ser Arg Ala Val Val LysPro Ser Glu Leu Asp Val Asn 340 345 350 Met Asn His Glu Asn Val Val AlaPhe Ala Asp Trp Ser Arg Leu Gly 355 360 365 Phe Asp Glu Val Asp Phe GlyTrp Gly Asn Ala Val Ser Val Ser Pro 370 375 380 Val Gln 
Gln Gln Ser Ala Leu Ala Met Gln Asn Tyr Phe Leu Phe Leu 385 390 395 400 Lys Pro Ser Lys Asn Lys Pro Asp Gly Ile Lys Ile Leu Met Phe Leu 405 410 415 Pro Leu Ser Lys Met Lys Ser Phe Lys Ile Glu Met Glu Ala Met Met 420 425 430 Lys Lys Tyr Val Ala Lys Val 435
29 15 PRT Artificial Sequence Description of Artificial Sequence proteolytic fragment 29 Thr Thr Leu Gln Leu Ser Ser Ile Asp Asn Leu Pro Gly Val Arg 1 5 10 15
30 11 PRT Artificial Sequence Description of Artificial Sequence proteolytic fragment 30 Ile Leu Val Tyr Tyr Pro Pro Phe Ala Gly Arg 1 5 10
31 12 PRT Artificial Sequence Description of Artificial Sequence proteolytic fragment 31 Phe Thr Cys Gly Gly Phe Val Val Gly Val Ser Phe 1 5 10
32 12 PRT Artificial Sequence Description of Artificial Sequence proteolytic fragment 32 Lys Gly Leu Ala Glu Ile Ala Arg Gly Glu Val Lys 1 5 10
33 15 PRT Artificial Sequence Description of Artificial Sequence proteolytic fragment 33 Asn Leu Pro Asn Asp Thr Asn Pro Ser Ser Gly Tyr Tyr Gly Asn 1 5 10 15
34 20 DNA Artificial Sequence Description of Artificial Sequence PCR primer 34 atnctngtnt attatccncc 20
35 20 DNA Artificial Sequence Description of Artificial Sequence PCR primer 35 tattatccnc cntttgcngg 20
36 20 DNA Artificial Sequence Description of Artificial Sequence PCR primer 36 ttctatccnt tcgcnggnag 20
37 20 DNA Artificial Sequence Description of Artificial Sequence PCR primer 37 tactatccnt tngcnggnag 20
38 20 DNA Artificial Sequence Description of Artificial Sequence PCR primer 38 ctaaaaccna ccccntttgg 20
39 7 PRT Artificial Sequence Description of Artificial Sequence consensus sequence 39 Phe Tyr Pro Phe Ala Gly Arg 1 5
40 7 PRT Artificial Sequence Description of Artificial Sequence consensus sequence 40 Tyr Tyr Pro Leu Ala Gly Arg 1 5
41 7 PRT Artificial Sequence Description of Artificial Sequence consensus sequence 41 Asp Phe Gly Trp Gly Lys Pro 1 5
42 24 DNA Artificial Sequence Description of Artificial Sequence PCR primer 42 cctcatcttt cccccattga taat 24
43 27 DNA Artificial Sequence Description of Artificial
SequencePCR primer 43 aaaaagaaaataattttgcc atgcaag 27 44 1320 DNA Taxus cuspidata 44 atggcaggctcaacagaatt tgtggtaaga agcttagaga gagtgatggt ggctccaagc 60 cagccatcgcccaaagcttt cctgcagctc tccacccttg acaatctacc aggggtgaga 120 gaaaacatttttaacacctt gttagtctac aatgcctcag acagagtttc cgtagatcct 180 gcaaaagtaattcggcaggc tctctccaag gtgttggtgt actattcccc ttttgcaggg 240 cgtctcaggaaaaaagaaaa tggagatctt gaagtggagt gcacagggga gggtgctctg 300 tttgtggaagccatggctga cactgacctc tcagtcttag gagatttgga tgactacagt 360 ccttcacttgagcaactact tttttgtctt ccgcctgata cagatattga ggacatccat 420 cctctggtggttcaggtaac tcgttttaca tgtggaggtt ttgttgtagg ggtgagtttc 480 tgccatggtatatgtgatgg actaggagca ggccagtttc ttatagccat gggagagatg 540 gcaaggggagagattaagcc ctcctcggag ccaatatgga agagagaatt gctgaagccg 600 gaagaccctttataccggtt ccagtattat cactttcaat tgatttgccc gccttcaaca 660 ttcgggaaaatagttcaagg atctcttgtt ataacctctg agacaataaa ttgtatcaaa 720 caatgccttagggaagaaag taaagaattt tgctctgcgt tcgaagttgt atctgcattg 780 gcttggatagcaaggacaag ggctcttcaa attccacata gtgagaatgt gaagcttatt 840 tttgcaatggacatgagaaa attatttaat ccaccacttt cgaagggata ctacggtaat 900 tttgttggtaccgtatgtgc aatggataat gtcaaggacc tattaagtgg atctcttttg 960 cgtgttgtaaggattataaa gaaagcaaag gtctctttaa atgagcattt cacgtcaaca 1020 atcgtgacaccccgttctgg atcagatgag agtatcaatt atgaaaacat agttggattt 1080 ggtgatcgaaggcgattggg atttgatgaa gtagactttg ggtgggggca tgcagataat 1140 gtaagtctcgtgcaacatgg attgaaggat gtttcagtcg tgcaaagtta ttttcttttc 1200 atacgacctcccaagaataa ccccgatgga atcaagatcc tatcgttcat gcccccgtca 1260 atagtgaaatccttcaaatt tgaaatggaa accatgacaa acaaatatgt aactaagcct 1320 45 440 PRTTaxus cuspidata 45 Met Ala Gly Ser Thr Glu Phe Val Val Arg Ser Leu GluArg Val Met 1 5 10 15 Val Ala Pro Ser Gln Pro Ser Pro Lys Ala Phe LeuGln Leu Ser Thr 20 25 30 Leu Asp Asn Leu Pro Gly Val Arg Glu Asn Ile PheAsn Thr Leu Leu 35 40 45 Val Tyr Asn Ala Ser Asp Arg Val Ser Val Asp ProAla Lys Val Ile 50 55 60 Arg Gln Ala Leu Ser Lys Val Leu Val Tyr Tyr SerPro Phe Ala Gly 65 70 75 
80 Arg Leu Arg Lys Lys Glu Asn Gly Asp Leu Glu Val Glu Cys Thr Gly 85 90 95 Glu Gly Ala Leu Phe Val Glu Ala Met Ala Asp Thr Asp Leu Ser Val 100 105 110 Leu Gly Asp Leu Asp Asp Tyr Ser Pro Ser Leu Glu Gln Leu Leu Phe 115 120 125 Cys Leu Pro Pro Asp Thr Asp Ile Glu Asp Ile His Pro Leu Val Val 130 135 140 Gln Val Thr Arg Phe Thr Cys Gly Gly Phe Val Val Gly Val Ser Phe 145 150 155 160 Cys His Gly Ile Cys Asp Gly Leu Gly Ala Gly Gln Phe Leu Ile Ala 165 170 175 Met Gly Glu Met Ala Arg Gly Glu Ile Lys Pro Ser Ser Glu Pro Ile 180 185 190 Trp Lys Arg Glu Leu Leu Lys Pro Glu Asp Pro Leu Tyr Arg Phe Gln 195 200 205 Tyr Tyr His Phe Gln Leu Ile Cys Pro Pro Ser Thr Phe Gly Lys Ile 210 215 220 Val Gln Gly Ser Leu Val Ile Thr Ser Glu Thr Ile Asn Cys Ile Lys 225 230 235 240 Gln Cys Leu Arg Glu Glu Ser Lys Glu Phe Cys Ser Ala Phe Glu Val 245 250 255 Val Ser Ala Leu Ala Trp Ile Ala Arg Thr Arg Ala Leu Gln Ile Pro 260 265 270 His Ser Glu Asn Val Lys Leu Ile Phe Ala Met Asp Met Arg Lys Leu 275 280 285 Phe Asn Pro Pro Leu Ser Lys Gly Tyr Tyr Gly Asn Phe Val Gly Thr 290 295 300 Val Cys Ala Met Asp Asn Val Lys Asp Leu Leu Ser Gly Ser Leu Leu 305 310 315 320 Arg Val Val Arg Ile Ile Lys Lys Ala Lys Val Ser Leu Asn Glu His 325 330 335 Phe Thr Ser Thr Ile Val Thr Pro Arg Ser Gly Ser Asp Glu Ser Ile 340 345 350 Asn Tyr Glu Asn Ile Val Gly Phe Gly Asp Arg Arg Arg Leu Gly Phe 355 360 365 Asp Glu Val Asp Phe Gly Trp Gly His Ala Asp Asn Val Ser Leu Val 370 375 380 Gln His Gly Leu Lys Asp Val Ser Val Val Gln Ser Tyr Phe Leu Phe 385 390 395 400 Ile Arg Pro Pro Lys Asn Asn Pro Asp Gly Ile Lys Ile Leu Ser Phe 405 410 415 Met Pro Pro Ser Ile Val Lys Ser Phe Lys Phe Glu Met Glu Thr Met 420 425 430 Thr Asn Lys Tyr Val Thr Lys Pro 435 440
46 36 DNA Artificial Sequence Description of Artificial Sequence PCR Primer 46 gggaattcca tatggcaggc tcaacagaat ttgtgg 36
47 32 DNA Artificial Sequence Description of Artificial Sequence PCR Primer 47 gtttatacat tgattcggaa ctagatctga tc 32
48 6 PRT Artificial Sequence Description of Artificial
Sequence 6 amino acidmotif found in acyl transferases 48 His Xaa Xaa Xaa Asp Gly 1 5 49 1332DNA Taxus cuspidata 49 atggagaagt ctggttcagc agatctacat gtaaatatcattgagcgagt ggtggtggcg 60 ccatgccagc cgacgcccaa aacaatcctg cagctctctagcattgacaa aatgggagga 120 ggatttgcca acgtattgct agtcttcggt gcctcccatggcgtttctgc agatcctgca 180 aaaacaattc gagaggctct ctccaagacc ttggtcttttatttcccttt tgctgggcgg 240 ctcagaaaga aagaagatgg ggatatcgaa gtggagtgcatagagcaggg agctctgttc 300 gtggaagcca tggcggacaa cgatctttca gtcgtacgagatctggatga gtacaatcca 360 ttatttcggc agctacaatc ttcgctttca ctggatacagattacaagga cctccatctt 420 atgactgttc aggtaactcc gtttacatgt gggggttttgtcatgggaac gagtgtacac 480 caaagtatat gcgatggaaa tggattgggg caattttttaaaagcatggc agagatagtg 540 aggggagaag ttaagccctc aatcgaacca atatggaatagagaattggt gaagcctgaa 600 gactatatac acctccagtt gtatgtcagt gaattcattcgcccaccttt agtagttgag 660 aaagttgggc aaacatctct tgttataagc ttcgagaaaataaatcatat caaacgatgc 720 attatggaag aaagtaaaga atctttctct tcatttgaaattgtaacagc aatggtttgg 780 ctagcaagga caagggcttt tcaaattcca cacaacgaggatgtgactct tctccttgca 840 atggatgcaa ggagatcatt tgacccccct attccgaagggatactacgg taatgtcatt 900 ggtactacat atgcaaaaga taatgtccac aacctcttaagtggatctct tttgcatgct 960 ctaacagtta taaagaaatc aatgtcctca ttttatgagaatatgacctc aagagtcttg 1020 gtgaacccat ctacattaga tttgagtatg aagtatgaaaatgtagtttc acttagtgat 1080 tggagccggt tgggacataa tgaagtggac tttgggtggggaaatgcaat aaatgtaagc 1140 actctgcaac aacaatggga aaatgaggta gctataccaactttttttac tttccttcaa 1200 actcccaaga atataccaga tggaatcaag atactaatgttcatgccccc atcaagagag 1260 aaaacattcg aaattgaagt ggaagccatg ataagaaaatatttgactaa agtgtcgcat 1320 tcaaagctat aa 1332 50 443 PRT Taxus cuspidata50 Met Glu Lys Ser Gly Ser Ala Asp Leu His Val Asn Ile Ile Glu Arg 1 510 15 Val Val Val Ala Pro Cys Gln Pro Thr Pro Lys Thr Ile Leu Gln Leu 2025 30 Ser Ser Ile Asp Lys Met Gly Gly Gly Phe Ala Asn Val Leu Leu Val 3540 45 Phe Gly Ala Ser His Gly Val Ser Ala Asp Pro Ala Lys Thr Ile Arg 5055 60 Glu Ala Leu Ser Lys Thr 
Leu Val Phe Tyr Phe Pro Phe Ala Gly Arg 6570 75 80 Leu Arg Lys Lys Glu Asp Gly Asp Ile Glu Val Glu Cys Ile Glu Gln85 90 95 Gly Ala Leu Phe Val Glu Ala Met Ala Asp Asn Asp Leu Ser Val Val100 105 110 Arg Asp Leu Asp Glu Tyr Asn Pro Leu Phe Arg Gln Leu Gln SerSer 115 120 125 Leu Ser Leu Asp Thr Asp Tyr Lys Asp Leu His Leu Met ThrVal Gln 130 135 140 Val Thr Pro Phe Thr Cys Gly Gly Phe Val Met Gly ThrSer Val His 145 150 155 160 Gln Ser Ile Cys Asp Gly Asn Gly Leu Gly GlnPhe Phe Lys Ser Met 165 170 175 Ala Glu Ile Val Arg Gly Glu Val Lys ProSer Ile Glu Pro Ile Trp 180 185 190 Asn Arg Glu Leu Val Lys Pro Glu AspTyr Ile His Leu Gln Leu Tyr 195 200 205 Val Ser Glu Phe Ile Arg Pro ProLeu Val Val Glu Lys Val Gly Gln 210 215 220 Thr Ser Leu Val Ile Ser PheGlu Lys Ile Asn His Ile Lys Arg Cys 225 230 235 240 Ile Met Glu Glu SerLys Glu Ser Phe Ser Ser Phe Glu Ile Val Thr 245 250 255 Ala Met Val TrpLeu Ala Arg Thr Arg Ala Phe Gln Ile Pro His Asn 260 265 270 Glu Asp ValThr Leu Leu Leu Ala Met Asp Ala Arg Arg Ser Phe Asp 275 280 285 Pro ProIle Pro Lys Gly Tyr Tyr Gly Asn Val Ile Gly Thr Thr Tyr 290 295 300 AlaLys Asp Asn Val His Asn Leu Leu Ser Gly Ser Leu Leu His Ala 305 310 315320 Leu Thr Val Ile Lys Lys Ser Met Ser Ser Phe Tyr Glu Asn Met Thr 325330 335 Ser Arg Val Leu Val Asn Pro Ser Thr Leu Asp Leu Ser Met Lys Tyr340 345 350 Glu Asn Val Val Ser Leu Ser Asp Trp Ser Arg Leu Gly His AsnGlu 355 360 365 Val Asp Phe Gly Trp Gly Asn Ala Ile Asn Val Ser Thr LeuGln Gln 370 375 380 Gln Trp Glu Asn Glu Val Ala Ile Pro Thr Phe Phe ThrPhe Leu Gln 385 390 395 400 Thr Pro Lys Asn Ile Pro Asp Gly Ile Lys IleLeu Met Phe Met Pro 405 410 415 Pro Ser Arg Glu Lys Thr Phe Glu Ile GluVal Glu Ala Met Ile Arg 420 425 430 Lys Tyr Leu Thr Lys Val Ser His SerLys Leu 435 440 51 1338 DNA Taxus cuspidata 51 atgaagaaga caggttcgtttgcagagttc catgtgaata tgattgagcg agtcatggtg 60 agaccgtgcc tgccttcgcccaaaacaatc ctccctctct ccgccattga caacatggca 120 agagcttttt ctaacgtattgctggtctac gctgccaaca tggacagagt ctctgcagat 
180 cctgcaaaag tgattcgagaggctctctcc aaggtgctgg tttattatta cccttttgct 240 gggcggctca gaaataaagaaaatggggaa cttgaagtgg agtgcacagg gcagggtgtt 300 ctgtttctgg aagccatggctgacagcgac ctttcagtct taacagatct ggataactac 360 aatccatcgt ttcagcagttgattttttct ctaccacagg atacagatat tgaggacctc 420 catctcttga ttgttcaggtaactcgtttt acatgtgggg gttttgttgt gggagcgaat 480 gtgtatggta gtgcatgcgatgcaaaagga tttggccagt ttcttcaaag tatggcagag 540 atggcgagag gagaggttaagccctcgatt gaaccgatat ggaatagaga actggtgaag 600 ctagaacatt gtatgcccttccggatgagt catcttcaaa ttatacatgc acctgtaatt 660 gaggagaaat ttgttcaaacatctcttgtt ataaactttg agataataaa tcatatcaga 720 cgacgcatca tggaagaacgcaaagaaagt ttatcttcat ttgaaattgt agcagcattg 780 gtttggctag caaagataaaggcttttcaa attccacata gtgagaatgt gaagcttctt 840 tttgcaatgg acttgaggagatcatttaat ccccctcttc cacatggata ctatggcaat 900 gcctttggta ttgcatgtgcaatggataat gtccatgacc ttctaagtgg atctcttttg 960 cgcactataa tgatcataaagaaatcaaag ttctctttac acaaagaact caactcaaaa 1020 accgtgatga gctcatctgtagtagatgtc aatacgaagt ttgaagatgt agtttcaatt 1080 agtgattgga ggcattctatatattatgaa gtggactttg ggtggggaga tgcaatgaac 1140 gtgagcacta tgctacaacaacaggagcac gagaaatctc tgccaactta tttttctttc 1200 ctacaatcta ctaagaacatgccagatgga atcaagatgc taatgtttat gcctccatca 1260 aaactgaaaa aattcaaaattgaaatagaa gctatgataa aaaaatatgt gactaaagtg 1320 tgtccgtcaa agttatga1338 52 445 PRT Taxus cuspidata 52 Met Lys Lys Thr Gly Ser Phe Ala GluPhe His Val Asn Met Ile Glu 1 5 10 15 Arg Val Met Val Arg Pro Cys LeuPro Ser Pro Lys Thr Ile Leu Pro 20 25 30 Leu Ser Ala Ile Asp Asn Met AlaArg Ala Phe Ser Asn Val Leu Leu 35 40 45 Val Tyr Ala Ala Asn Met Asp ArgVal Ser Ala Asp Pro Ala Lys Val 50 55 60 Ile Arg Glu Ala Leu Ser Lys ValLeu Val Tyr Tyr Tyr Pro Phe Ala 65 70 75 80 Gly Arg Leu Arg Asn Lys GluAsn Gly Glu Leu Glu Val Glu Cys Thr 85 90 95 Gly Gln Gly Val Leu Phe LeuGlu Ala Met Ala Asp Ser Asp Leu Ser 100 105 110 Val Leu Thr Asp Leu AspAsn Tyr Asn Pro Ser Phe Gln Gln Leu Ile 115 120 125 Phe Ser Leu Pro GlnAsp Thr Asp Ile Glu 
Asp Leu His Leu Leu Ile 130 135 140 Val Gln Val ThrArg Phe Thr Cys Gly Gly Phe Val Val Gly Ala Asn 145 150 155 160 Val TyrGly Ser Ala Cys Asp Ala Lys Gly Phe Gly Gln Phe Leu Gln 165 170 175 SerMet Ala Glu Met Ala Arg Gly Glu Val Lys Pro Ser Ile Glu Pro 180 185 190Ile Trp Asn Arg Glu Leu Val Lys Leu Glu His Cys Met Pro Phe Arg 195 200205 Met Ser His Leu Gln Ile Ile His Ala Pro Val Ile Glu Glu Lys Phe 210215 220 Val Gln Thr Ser Leu Val Ile Asn Phe Glu Ile Ile Asn His Ile Arg225 230 235 240 Arg Arg Ile Met Glu Glu Arg Lys Glu Ser Leu Ser Ser PheGlu Ile 245 250 255 Val Ala Ala Leu Val Trp Leu Ala Lys Ile Lys Ala PheGln Ile Pro 260 265 270 His Ser Glu Asn Val Lys Leu Leu Phe Ala Met AspLeu Arg Arg Ser 275 280 285 Phe Asn Pro Pro Leu Pro His Gly Tyr Tyr GlyAsn Ala Phe Gly Ile 290 295 300 Ala Cys Ala Met Asp Asn Val His Asp LeuLeu Ser Gly Ser Leu Leu 305 310 315 320 Arg Thr Ile Met Ile Ile Lys LysSer Lys Phe Ser Leu His Lys Glu 325 330 335 Leu Asn Ser Lys Thr Val MetSer Ser Ser Val Val Asp Val Asn Thr 340 345 350 Lys Phe Glu Asp Val ValSer Ile Ser Asp Trp Arg His Ser Ile Tyr 355 360 365 Tyr Glu Val Asp PheGly Trp Gly Asp Ala Met Asn Val Ser Thr Met 370 375 380 Leu Gln Gln GlnGlu His Glu Lys Ser Leu Pro Thr Tyr Phe Ser Phe 385 390 395 400 Leu GlnSer Thr Lys Asn Met Pro Asp Gly Ile Lys Met Leu Met Phe 405 410 415 MetPro Pro Ser Lys Leu Lys Lys Phe Lys Ile Glu Ile Glu Ala Met 420 425 430Ile Lys Lys Tyr Val Thr Lys Val Cys Pro Ser Lys Leu 435 440 445 53 1326DNA Taxus cuspidata 53 atggagaagg caggctcaac agacttccat gtaaagaaatttgatccagt catggtagcc 60 ccaagccttc catcgcccaa agctaccgtc cagctctctgtcgttgatag cctaacaatc 120 tgcaggggaa tttttaacac gttgttggtt ttcaatgcccctgacaacat ttctgcagat 180 cctgtaaaaa taattagaga ggctctctcc aaggtgttggtgtattattt ccctcttgct 240 gggcggctca gaagtaaaga aattggggaa cttgaagtggagtgcacagg ggatggtgct 300 ctgtttgtgg aagccatggt ggaagacacc atttcagtcttacgagatct ggatgacctc 360 aatccatcat ttcagcagtt agttttttgg catccattggacactgctat tgaggatctt 420 catcttgtga ttgttcaggt 
aacacgtttt acatgtgggggcattgccgt tggagtgact 480 ttgccccata gtgtatgtga tggacgtgga gcagcccagtttgttacagc actggcagag 540 atggcgaggg gagaggttaa gccctcacta gaaccaatatggaatagaga attgttgaac 600 cctgaagacc ctctacatct ccagttaaat caatttgattcgatatgccc acctccaatg 660 ctggaggaat tgggtcaagc ttcttttgtt ataaacgttgacaccataga atatatgaag 720 caatgtgtca tggaggaatg taatgaattt tgttcgtcttttgaagtagt ggcagcattg 780 gtttggatag cacggacaaa ggctcttcaa attccacatactgagaatgt gaagcttctc 840 tttgcgatgg atttgaggaa attatttaat cccccacttccaaatggata ttatggtaat 900 gccattggta ctgcatatgc aatggataat gtccaagacctcttaaatgg atctcttttg 960 cgtgctataa tgattataaa aaaagcaaag gctgatttaaaagataatta ttcgaggtca 1020 agggtagtta caaacccata ttcattagat gtgaacaagaaatccgacaa cattcttgca 1080 ttgagtgact ggaggcggtt gggattttat gaagccgattttgggtgggg aggtccactg 1140 aatgtaagtt ccctgcaacg gttggaaaat ggattgcctatgtttagtac ttttctatac 1200 ctactacctg ccaaaaacaa gtctgatgga atcaagctgctactgtcttg tatgccacca 1260 acaacattga aatcatttaa aattgtaatg gaagctatgatagagaaata tgtaagtaaa 1320 gtgtga 1326 54 441 PRT Taxus cuspidata 54 MetGlu Lys Ala Gly Ser Thr Asp Phe His Val Lys Lys Phe Asp Pro 1 5 10 15Val Met Val Ala Pro Ser Leu Pro Ser Pro Lys Ala Thr Val Gln Leu 20 25 30Ser Val Val Asp Ser Leu Thr Ile Cys Arg Gly Ile Phe Asn Thr Leu 35 40 45Leu Val Phe Asn Ala Pro Asp Asn Ile Ser Ala Asp Pro Val Lys Ile 50 55 60Ile Arg Glu Ala Leu Ser Lys Val Leu Val Tyr Tyr Phe Pro Leu Ala 65 70 7580 Gly Arg Leu Arg Ser Lys Glu Ile Gly Glu Leu Glu Val Glu Cys Thr 85 9095 Gly Asp Gly Ala Leu Phe Val Glu Ala Met Val Glu Asp Thr Ile Ser 100105 110 Val Leu Arg Asp Leu Asp Asp Leu Asn Pro Ser Phe Gln Gln Leu Val115 120 125 Phe Trp His Pro Leu Asp Thr Ala Ile Glu Asp Leu His Leu ValIle 130 135 140 Val Gln Val Thr Arg Phe Thr Cys Gly Gly Ile Ala Val GlyVal Thr 145 150 155 160 Leu Pro His Ser Val Cys Asp Gly Arg Gly Ala AlaGln Phe Val Thr 165 170 175 Ala Leu Ala Glu Met Ala Arg Gly Glu Val LysPro Ser Leu Glu Pro 180 185 190 Ile Trp Asn Arg Glu Leu Leu Asn Pro GluAsp Pro 
Leu His Leu Gln 195 200 205 Leu Asn Gln Phe Asp Ser Ile Cys ProPro Pro Met Leu Glu Glu Leu 210 215 220 Gly Gln Ala Ser Phe Val Ile AsnVal Asp Thr Ile Glu Tyr Met Lys 225 230 235 240 Gln Cys Val Met Glu GluCys Asn Glu Phe Cys Ser Ser Phe Glu Val 245 250 255 Val Ala Ala Leu ValTrp Ile Ala Arg Thr Lys Ala Leu Gln Ile Pro 260 265 270 His Thr Glu AsnVal Lys Leu Leu Phe Ala Met Asp Leu Arg Lys Leu 275 280 285 Phe Asn ProPro Leu Pro Asn Gly Tyr Tyr Gly Asn Ala Ile Gly Thr 290 295 300 Ala TyrAla Met Asp Asn Val Gln Asp Leu Leu Asn Gly Ser Leu Leu 305 310 315 320Arg Ala Ile Met Ile Ile Lys Lys Ala Lys Ala Asp Leu Lys Asp Asn 325 330335 Tyr Ser Arg Ser Arg Val Val Thr Asn Pro Tyr Ser Leu Asp Val Asn 340345 350 Lys Lys Ser Asp Asn Ile Leu Ala Leu Ser Asp Trp Arg Arg Leu Gly355 360 365 Phe Tyr Glu Ala Asp Phe Gly Trp Gly Gly Pro Leu Asn Val SerSer 370 375 380 Leu Gln Arg Leu Glu Asn Gly Leu Pro Met Phe Ser Thr PheLeu Tyr 385 390 395 400 Leu Leu Pro Ala Lys Asn Lys Ser Asp Gly Ile LysLeu Leu Leu Ser 405 410 415 Cys Met Pro Pro Thr Thr Leu Lys Ser Phe LysIle Val Met Glu Ala 420 425 430 Met Ile Glu Lys Tyr Val Ser Lys Val 435440 55 1347 DNA Taxus cuspidata 55 atggagaagg gaaatgcgag tgatgtgccagaattgcatg tacagatctg tgagcgggtg 60 atggtgaaac catgcgtgcc ttctccttcgccaaatcttg tcctccagct ctccgcggtg 120 gacagactgc cagggatgaa gtttgctacttttagcgccg tgttagtcta caatgccagc 180 tctcactcca tttttgcaaa tcctgcacagattattcggc aggctctctc caaggtgttg 240 cagtattatc ccgcttttgc cgggcggatcagacagaaag aaaatgagga actggaagtg 300 gagtgcacag gggagggtgc gctgtttgtggaagccctgg tcgacaatga tctttcagtc 360 ttgcgagatt tggatgccca aaatgcatcttatgagcagt tgctcttttc gcttccgccc 420 aatatacagg ttcaggacct ccatcctctgattcttcagg taactcgttt tacgtgtgga 480 ggttttgttg tgggagtagg ttttcaccatggtatatgcg acgcacgagg aggaactcaa 540 tttcttcaag gcctagcaga tatggcaaggggagagacta agcctttagt ggaaccagta 600 tggaatagag aactgataaa gcccgaagatctaatgcacc tccaatttca taagtttggt 660 ttgatacgcc aacctctaaa acttgatgaaatttgtcaag catcttttac tataaactca 720 
gagataataa attacatcaa acaatgtgttatagaagaat gtaacgaaat tttctctgca 780 tttgaagttg tagtagcatt aacttggatagcaaggacaa aggcttttca aattccacat 840 aatgagaatg tgatgatgct ctttggaatggacgcgagga aatattttaa tcccccactt 900 ccaaagggat attatggtaa tgccattggtacttcatgtg taattgaaaa tgtacaagac 960 ctcttaaatg gatctctttc gcgtgctgtaatgattacaa agaaatcaaa gatcccttta 1020 attgagaatt taaggtcaag aattgtggcgaaccaatctg gagtagatga ggaaattaag 1080 catgaaaacg tagttggatt tggagattggaggcgattgg gatttcatga agtggacttc 1140 ggatcgggag atgcagtgaa catcagccccatacaacaac gactagagga tgatcaattg 1200 gctatgcgaa attattttct tttccttcgaccttacaagg acatgcctaa tggaatcaaa 1260 atactaatgt tcatggatcc atcaagagtgaaattattca aagatgaaat ggaagccatg 1320 ataattaaat atatgccgaa agcctaa 134756 448 PRT Taxus cuspidata 56 Met Glu Lys Gly Asn Ala Ser Asp Val ProGlu Leu His Val Gln Ile 1 5 10 15 Cys Glu Arg Val Met Val Lys Pro CysVal Pro Ser Pro Ser Pro Asn 20 25 30 Leu Val Leu Gln Leu Ser Ala Val AspArg Leu Pro Gly Met Lys Phe 35 40 45 Ala Thr Phe Ser Ala Val Leu Val TyrAsn Ala Ser Ser His Ser Ile 50 55 60 Phe Ala Asn Pro Ala Gln Ile Ile ArgGln Ala Leu Ser Lys Val Leu 65 70 75 80 Gln Tyr Tyr Pro Ala Phe Ala GlyArg Ile Arg Gln Lys Glu Asn Glu 85 90 95 Glu Leu Glu Val Glu Cys Thr GlyGlu Gly Ala Leu Phe Val Glu Ala 100 105 110 Leu Val Asp Asn Asp Leu SerVal Leu Arg Asp Leu Asp Ala Gln Asn 115 120 125 Ala Ser Tyr Glu Gln LeuLeu Phe Ser Leu Pro Pro Asn Ile Gln Val 130 135 140 Gln Asp Leu His ProLeu Ile Leu Gln Val Thr Arg Phe Thr Cys Gly 145 150 155 160 Gly Phe ValVal Gly Val Gly Phe His His Gly Ile Cys Asp Ala Arg 165 170 175 Gly GlyThr Gln Phe Leu Gln Gly Leu Ala Asp Met Ala Arg Gly Glu 180 185 190 ThrLys Pro Leu Val Glu Pro Val Trp Asn Arg Glu Leu Ile Lys Pro 195 200 205Glu Asp Leu Met His Leu Gln Phe His Lys Phe Gly Leu Ile Arg Gln 210 215220 Pro Leu Lys Leu Asp Glu Ile Cys Gln Ala Ser Phe Thr Ile Asn Ser 225230 235 240 Glu Ile Ile Asn Tyr Ile Lys Gln Cys Val Ile Glu Glu Cys AsnGlu 245 250 255 Ile Phe Ser Ala Phe Glu Val Val Val Ala Leu 
Thr Trp IleAla Arg 260 265 270 Thr Lys Ala Phe Gln Ile Pro His Asn Glu Asn Val MetMet Leu Phe 275 280 285 Gly Met Asp Ala Arg Lys Tyr Phe Asn Pro Pro LeuPro Lys Gly Tyr 290 295 300 Tyr Gly Asn Ala Ile Gly Thr Ser Cys Val IleGlu Asn Val Gln Asp 305 310 315 320 Leu Leu Asn Gly Ser Leu Ser Arg AlaVal Met Ile Thr Lys Lys Ser 325 330 335 Lys Ile Pro Leu Ile Glu Asn LeuArg Ser Arg Ile Val Ala Asn Gln 340 345 350 Ser Gly Val Asp Glu Glu IleLys His Glu Asn Val Val Gly Phe Gly 355 360 365 Asp Trp Arg Arg Leu GlyPhe His Glu Val Asp Phe Gly Ser Gly Asp 370 375 380 Ala Val Asn Ile SerPro Ile Gln Gln Arg Leu Glu Asp Asp Gln Leu 385 390 395 400 Ala Met ArgAsn Tyr Phe Leu Phe Leu Arg Pro Tyr Lys Asp Met Pro 405 410 415 Asn GlyIle Lys Ile Leu Met Phe Met Asp Pro Ser Arg Val Lys Leu 420 425 430 PheLys Asp Glu Met Glu Ala Met Ile Ile Lys Tyr Met Pro Lys Ala 435 440 44557 1317 DNA Taxus cuspidata 57 atggagaagt tacatgtgga tatcattgagagagtgaagg tggcgccatg ccttccatcg 60 tccaaagaaa ttctccagct ctccagcctcgacaacatac tcagatgtta tgtcagcgta 120 ttgttcgtct acgacagggt ttcaactgtttctgcaaatc ctgcaaaaac aattcgagag 180 gctctctcca aggttttggt ttattattcaccttttgctg gaaggctcag aaacaaagaa 240 aatggggatc ttgaagtgga gtgcagtggggagggtgctg tctttgtgga agccatggcg 300 gacaacgagc tttcagtctt acaagatttggatgagtact gtacatcgct taaacagcta 360 atttttacag taccaatgga tacgaaaattgaagacctcc atcttctaag tgttcaggta 420 actagtttta catgtggggg atttgttgtgggaataagtt tctaccatac tatatgtgat 480 ggaaaaggac tgggccagtt tcttcaaggcatgagtgaga tttccaaggg agcatttaaa 540 ccctcactag aaccagtatg gaatagagaaatggtgaagc ctgaacacct tatgttcctc 600 cagtttaata attttgaatt cgtaccacatcctcttaaat ttaagaagat tgttaaagca 660 tctattgaaa ttaactttga gacaataaattgtttcaagc aatgcatgat ggaagaatgt 720 aaagaaaatt tctctacatt tgaaattgtagcagcactga tttggctagc caagacaaag 780 tctttccaaa ttccagatag tgagaatgtgaaacttatgt ttgcagtcga catgaggaca 840 tcgtttgacc cccctcttcc aaagggatattatggtaatg ttattggtat tgcaggtgca 900 atagataatg tcaaagaact cttaagtggatcaattttgc gtgctctaat tattatccaa 960 
aagacaattt tctctttaaa agataatttcatatcaagaa gattgatgaa accatctaca 1020 ttggatgtga atatgaagca tgaaaatgtagttctcttag gggattggag gaatttggga 1080 tattatgagg cagattgtgg gtgtggaaatctatcaaatg taattcccat ggatcaacaa 1140 atagagcatg agtcacctgt gcaaagtagatttatgttgc ttcgatcatc caagaacatg 1200 caaaatggaa tcaagatact aatgtccatgcctgaatcaa tggcgaaacc attcaaaagt 1260 gaaatgaaat tcacaataaa aaaatatgtgactggagcgt gtttctctga gttatga 1317 58 438 PRT Taxus cuspidata 58 Met GluLys Leu His Val Asp Ile Ile Glu Arg Val Lys Val Ala Pro 1 5 10 15 CysLeu Pro Ser Ser Lys Glu Ile Leu Gln Leu Ser Ser Leu Asp Asn 20 25 30 IleLeu Arg Cys Tyr Val Ser Val Leu Phe Val Tyr Asp Arg Val Ser 35 40 45 ThrVal Ser Ala Asn Pro Ala Lys Thr Ile Arg Glu Ala Leu Ser Lys 50 55 60 ValLeu Val Tyr Tyr Ser Pro Phe Ala Gly Arg Leu Arg Asn Lys Glu 65 70 75 80Asn Gly Asp Leu Glu Val Glu Cys Ser Gly Glu Gly Ala Val Phe Val 85 90 95Glu Ala Met Ala Asp Asn Glu Leu Ser Val Leu Gln Asp Leu Asp Glu 100 105110 Tyr Cys Thr Ser Leu Lys Gln Leu Ile Phe Thr Val Pro Met Asp Thr 115120 125 Lys Ile Glu Asp Leu His Leu Leu Ser Val Gln Val Thr Ser Phe Thr130 135 140 Cys Gly Gly Phe Val Val Gly Ile Ser Phe Tyr His Thr Ile CysAsp 145 150 155 160 Gly Lys Gly Leu Gly Gln Phe Leu Gln Gly Met Ser GluIle Ser Lys 165 170 175 Gly Ala Phe Lys Pro Ser Leu Glu Pro Val Trp AsnArg Glu Met Val 180 185 190 Lys Pro Glu His Leu Met Phe Leu Gln Phe AsnAsn Phe Glu Phe Val 195 200 205 Pro His Pro Leu Lys Phe Lys Lys Ile ValLys Ala Ser Ile Glu Ile 210 215 220 Asn Phe Glu Thr Ile Asn Cys Phe LysGln Cys Met Met Glu Glu Cys 225 230 235 240 Lys Glu Asn Phe Ser Thr PheGlu Ile Val Ala Ala Leu Ile Trp Leu 245 250 255 Ala Lys Thr Lys Ser PheGln Ile Pro Asp Ser Glu Asn Val Lys Leu 260 265 270 Met Phe Ala Val AspMet Arg Thr Ser Phe Asp Pro Pro Leu Pro Lys 275 280 285 Gly Tyr Tyr GlyAsn Val Ile Gly Ile Ala Gly Ala Ile Asp Asn Val 290 295 300 Lys Glu LeuLeu Ser Gly Ser Ile Leu Arg Ala Leu Ile Ile Ile Gln 305 310 315 320 LysThr Ile Phe Ser Leu Lys Asp Asn Phe Ile Ser Arg Arg 
Leu Met 325 330 335Lys Pro Ser Thr Leu Asp Val Asn Met Lys His Glu Asn Val Val Leu 340 345350 Leu Gly Asp Trp Arg Asn Leu Gly Tyr Tyr Glu Ala Asp Cys Gly Cys 355360 365 Gly Asn Leu Ser Asn Val Ile Pro Met Asp Gln Gln Ile Glu His Glu370 375 380 Ser Pro Val Gln Ser Arg Phe Met Leu Leu Arg Ser Ser Lys AsnMet 385 390 395 400 Gln Asn Gly Ile Lys Ile Leu Met Ser Met Pro Glu SerMet Ala Lys 405 410 415 Pro Phe Lys Ser Glu Met Lys Phe Thr Ile Lys LysTyr Val Thr Gly 420 425 430 Ala Cys Phe Ser Glu Leu 435 59 331 PRTArabidopsis thaliana 59 Met Ser Gln Ile Leu Glu Asn Pro Asn Pro Asn GluLeu Asn Lys Leu 1 5 10 15 His Pro Phe Glu Phe His Glu Val Ser Asp ValPro Leu Thr Val Gln 20 25 30 Leu Thr Phe Phe Glu Cys Gly Gly Leu Ala LeuGly Ile Gly Leu Ser 35 40 45 His Lys Leu Cys Asp Ala Leu Ser Gly Leu IlePhe Val Asn Ser Trp 50 55 60 Ala Ala Phe Ala Arg Gly Gln Thr Asp Glu IleIle Thr Pro Ser Phe 65 70 75 80 Asp Leu Ala Lys Met Phe Pro Pro Cys AspIle Glu Asn Leu Asn Met 85 90 95 Ala Thr Gly Ile Thr Lys Glu Asn Ile ValThr Arg Arg Phe Val Phe 100 105 110 Leu Arg Ser Ser Val Glu Ser Leu ArgGlu Arg Phe Ser Gly Asn Lys 115 120 125 Lys Ile Arg Ala Thr Arg Val GluVal Leu Ser Val Phe Ile Trp Ser 130 135 140 Arg Phe Met Ala Ser Thr AsnHis Asp Asp Lys Thr Gly Lys Ile Tyr 145 150 155 160 Thr Leu Ile His ProVal Asn Leu Arg Arg Gln Ala Asp Pro Asp Ile 165 170 175 Pro Asp Asn MetPhe Gly Asn Ile Met Arg Phe Ser Val Thr Val Pro 180 185 190 Met Met IleIle Asn Glu Asn Asp Glu Glu Lys Ala Ser Leu Val Asp 195 200 205 Gln MetArg Glu Glu Ile Arg Lys Ile Asp Ala Val Tyr Val Lys Lys 210 215 220 LeuGln Glu Asp Asn Arg Gly His Leu Glu Phe Leu Asn Lys Gln Ala 225 230 235240 Ser Gly Phe Val Asn Gly Glu Ile Val Ser Phe Ser Phe Thr Ser Leu 245250 255 Cys Lys Phe Pro Val Tyr Glu Ala Asp Phe Gly Trp Gly Lys Pro Leu260 265 270 Trp Val Ala Ser Ala Arg Met Ser Tyr Lys Asn Leu Val Ala PheIle 275 280 285 Asp Thr Lys Glu Gly Asp Gly Ile Glu Ala Trp Ile Asn LeuAsp Gln 290 295 300 Asn Asp Met Ser Arg Phe Glu Ala Asp Glu Glu 
Leu LeuArg Tyr Val 305 310 315 320 Ser Ser Asn Pro Ser Val Met Val Ser Val Ser325 330 60 435 PRT Arabidopsis thaliana 60 Met Glu Ala Lys Leu Glu ValThr Gly Lys Glu Val Ile Lys Pro Ala 1 5 10 15 Ser Pro Ser Pro Arg AspArg Leu Gln Leu Ser Ile Leu Asp Leu Tyr 20 25 30 Cys Pro Gly Ile Tyr ValSer Thr Ile Phe Phe Tyr Asp Leu Ile Thr 35 40 45 Glu Ser Ser Glu Val PheSer Glu Asn Leu Lys Leu Ser Leu Ser Glu 50 55 60 Thr Leu Ser Arg Phe TyrPro Leu Ala Gly Arg Ile Glu Gly Leu Ser 65 70 75 80 Ile Ser Cys Asn AspGlu Gly Ala Val Phe Thr Glu Ala Arg Thr Asp 85 90 95 Leu Leu Leu Pro AspPhe Leu Arg Asn Leu Asn Thr Asp Ser Leu Ser 100 105 110 Gly Phe Leu ProThr Leu Ala Ala Gly Glu Ser Pro Ala Ala Trp Pro 115 120 125 Leu Leu SerVal Lys Val Thr Phe Phe Gly Ser Gly Ser Gly Val Ala 130 135 140 Val SerVal Ser Val Ser His Lys Ile Cys Asp Ile Ala Ser Leu Val 145 150 155 160Thr Phe Val Lys Asp Trp Ala Thr Thr Thr Ala Lys Gly Lys Ser Asn 165 170175 Ser Thr Ile Glu Phe Ala Glu Thr Thr Ile Tyr Pro Pro Pro Pro Ser 180185 190 His Met Tyr Glu Gln Phe Pro Ser Thr Asp Ser Asp Ser Asn Ile Thr195 200 205 Ser Lys Tyr Val Leu Lys Arg Phe Val Phe Glu Pro Ser Lys IleAla 210 215 220 Glu Leu Lys His Lys Ala Ala Ser Glu Ser Val Pro Val ProThr Arg 225 230 235 240 Val Glu Ala Ile Met Ser Leu Ile Trp Arg Cys AlaArg Asn Ser Ser 245 250 255 Arg Ser Asn Leu Leu Ile Pro Arg Gln Ala ValMet Trp Gln Ala Met 260 265 270 Asp Ile Arg Leu Arg Ile Pro Ser Ser ValAla Pro Lys Asp Val Ile 275 280 285 Gly Asn Leu Gln Ser Gly Phe Ser LeuLys Lys Asp Ala Glu Ser Glu 290 295 300 Phe Glu Ile Pro Glu Ile Val AlaThr Phe Arg Lys Asn Lys Glu Arg 305 310 315 320 Val Asn Glu Met Ile LysGlu Ser Leu Gln Gly Asn Thr Ile Gly Gln 325 330 335 Ser Leu Leu Ser LeuMet Ala Glu Thr Val Ser Glu Ser Thr Glu Ile 340 345 350 Asp Arg Tyr IleMet Ser Ser Trp Cys Arg Lys Pro Phe Tyr Glu Val 355 360 365 Asp Phe GlySer Gly Ser Pro Val Trp Val Gly Tyr Ala Ser His Thr 370 375 380 Ile TyrAsp Asn Met Val Gly Val Val Leu Ile Asp Ser Lys Glu Gly 385 390 395 400Asp 
Gly Val Glu Ala Trp Ile Ser Leu Pro Glu Glu Asp Met Ser Val 405 410415 Phe Val Asp Asp Gln Glu Leu Leu Ala Tyr Ala Val Leu Asn Pro Pro 420425 430 Val Val Ala 435 61 458 PRT Arabidopsis thaliana 61 Met Pro MetLeu Met Ala Thr Arg Ile Asp Ile Ile Gln Lys Leu Asn 1 5 10 15 Val TyrPro Arg Phe Gln Asn His Asp Lys Lys Lys Leu Ile Thr Leu 20 25 30 Ser AsnLeu Asp Arg Gln Cys Pro Leu Leu Met Tyr Ser Val Phe Phe 35 40 45 Tyr LysAsn Thr Thr Thr Arg Asp Phe Asp Ser Val Phe Ser Asn Leu 50 55 60 Lys LeuGly Leu Glu Glu Thr Met Ser Val Trp Tyr Pro Ala Ala Gly 65 70 75 80 ArgLeu Gly Leu Asp Gly Gly Gly Cys Lys Leu Asn Ile Arg Cys Asn 85 90 95 AspGly Gly Ala Val Met Val Glu Ala Val Ala Thr Gly Val Lys Leu 100 105 110Ser Glu Leu Gly Asp Leu Thr Gln Tyr Asn Glu Phe Tyr Glu Asn Leu 115 120125 Val Tyr Lys Pro Ser Leu Asp Gly Asp Phe Ser Val Met Pro Leu Val 130135 140 Val Ala Gln Val Thr Arg Phe Ala Cys Gly Gly Tyr Ser Ile Gly Ile145 150 155 160 Gly Thr Ser His Ser Leu Phe Asp Gly Ile Ser Ala Tyr GluPhe Ile 165 170 175 His Ala Trp Ala Ser Asn Ser His Ile His Asn Lys SerAsn Ser Lys 180 185 190 Ile Thr Asn Lys Lys Glu Asp Val Val Ile Lys ProVal His Asp Arg 195 200 205 Arg Asn Leu Leu Val Asn Arg Asp Ala Val ArgGlu Thr Asn Ala Ala 210 215 220 Ala Ile Cys His Leu Tyr Gln Leu Ile LysGln Ala Met Met Thr Tyr 225 230 235 240 Gln Glu Gln Asn Arg Asn Leu GluLeu Pro Asp Ser Gly Phe Val Ile 245 250 255 Lys Thr Phe Glu Leu Asn GlyAsp Ala Ile Glu Ser Met Lys Lys Lys 260 265 270 Ser Leu Glu Gly Phe MetCys Ser Ser Phe Glu Phe Leu Ala Ala His 275 280 285 Leu Trp Lys Ala ArgThr Arg Ala Leu Gly Leu Arg Arg Asp Ala Met 290 295 300 Val Cys Leu GlnPhe Ala Val Asp Ile Arg Lys Arg Thr Glu Thr Pro 305 310 315 320 Leu ProGlu Gly Phe Ser Gly Asn Ala Tyr Val Leu Ala Ser Val Ala 325 330 335 SerThr Ala Arg Glu Leu Leu Glu Glu Leu Thr Leu Glu Ser Ile Val 340 345 350Asn Lys Ile Arg Glu Ala Lys Lys Ser Ile Asp Gln Gly Tyr Ile Asn 355 360365 Ser Tyr Met Glu Ala Leu Gly Gly Ser Asn Asp Gly Asn Leu Pro Pro 370375 380 Leu 
Lys Glu Leu Thr Leu Ile Ser Asp Trp Thr Lys Met Pro Phe His385 390 395 400 Asn Val Gly Phe Gly Asn Gly Gly Glu Pro Ala Asp Tyr MetAla Pro 405 410 415 Leu Cys Pro Pro Val Pro Gln Val Ala Tyr Phe Met LysAsn Pro Lys 420 425 430 Asp Ala Lys Gly Val Leu Val Arg Ile Gly Leu AspPro Arg Asp Val 435 440 445 Asn Gly Phe Ser Asn His Phe Leu Asp Cys 450455 62 436 PRT Arabidopsis thaliana 62 Met Glu Lys Asn Val Glu Ile LeuSer Arg Glu Ile Val Lys Pro Ser 1 5 10 15 Ser Pro Thr Pro Asp Asp LysArg Ile Leu Asn Leu Ser Leu Leu Asp 20 25 30 Ile Leu Ser Ser Pro Met TyrThr Gly Ala Leu Leu Phe Tyr Ala Ala 35 40 45 Asp Pro Gln Asn Leu Leu GlyPhe Ser Thr Glu Glu Thr Ser Leu Lys 50 55 60 Leu Lys Lys Ser Leu Ser LysThr Leu Pro Ile Phe Tyr Pro Leu Ala 65 70 75 80 Gly Arg Ile Ile Gly SerPhe Val Glu Cys Asn Asp Glu Gly Ala Val 85 90 95 Phe Ile Glu Ala Arg ValAsp His Leu Leu Ser Glu Phe Leu Lys Cys 100 105 110 Pro Val Pro Glu SerLeu Glu Leu Leu Ile Pro Val Glu Ala Lys Ser 115 120 125 Arg Glu Ala ValThr Trp Pro Val Leu Leu Ile Gln Ala Asn Phe Phe 130 135 140 Ser Cys GlyGly Leu Val Ile Thr Ile Cys Val Ser His Lys Ile Thr 145 150 155 160 AspAla Thr Ser Leu Ala Met Phe Ile Arg Gly Trp Ala Glu Ser Ser 165 170 175Arg Gly Leu Gly Ile Thr Leu Ile Pro Ser Phe Thr Ala Ser Glu Val 180 185190 Phe Pro Lys Pro Leu Asp Glu Leu Pro Ser Lys Pro Met Asp Arg Lys 195200 205 Glu Glu Val Glu Glu Met Ser Cys Val Thr Lys Arg Phe Val Phe Asp210 215 220 Ala Ser Lys Ile Lys Lys Leu Arg Ala Lys Ala Ser Arg Asn LeuVal 225 230 235 240 Lys Asn Pro Thr Arg Val Glu Ala Val Thr Ala Leu PheTrp Arg Cys 245 250 255 Val Thr Lys Val Ser Arg Leu Ser Ser Leu Thr ProArg Thr Ser Val 260 265 270 Leu Gln Ile Leu Val Asn Leu Arg Gly Lys ValAsp Ser Leu Cys Glu 275 280 285 Asn Thr Ile Gly Asn Met Leu Ser Leu MetIle Leu Lys Asn Glu Glu 290 295 300 Ala Ala Ile Glu Arg Ile Gln Asp ValVal Asp Glu Ile Arg Arg Ala 305 310 315 320 Lys Glu Ile Phe Ser Leu AsnCys Lys Glu Met Ser Lys Ser Ser Ser 325 330 335 Arg Ile Phe Glu Leu LeuGlu Glu Ile Gly Lys 
Val Tyr Gly Arg Gly 340 345 350 Asn Glu Met Asp LeuTrp Met Ser Asn Ser Trp Cys Lys Leu Gly Leu 355 360 365 Tyr Asp Ala AspPhe Gly Trp Gly Lys Pro Val Trp Val Thr Gly Arg 370 375 380 Gly Thr SerHis Phe Lys Asn Leu Met Leu Leu Ile Asp Thr Lys Asp 385 390 395 400 GlyGlu Gly Ile Glu Ala Trp Ile Thr Leu Thr Glu Glu Gln Met Ser 405 410 415Leu Phe Glu Cys Asp Gln Glu Leu Leu Glu Ser Ala Ser Leu Asn Pro 420 425430 Pro Val Leu Ile 435 63 482 PRT Arabidopsis thaliana 63 Met Pro SerLeu Glu Lys Ser Val Thr Ile Ile Ser Arg Asn Arg Val 1 5 10 15 Phe ProAsp Gln Lys Ser Thr Leu Val Asp Leu Lys Leu Ser Val Ser 20 25 30 Asp LeuPro Met Leu Ser Cys His Tyr Ile Gln Lys Gly Cys Leu Phe 35 40 45 Thr CysPro Asn Leu Pro Leu Pro Ala Leu Ile Ser His Leu Lys His 50 55 60 Ser LeuSer Ile Thr Leu Thr His Phe Pro Pro Leu Ala Gly Arg Leu 65 70 75 80 SerThr Ser Ser Ser Gly His Val Phe Leu Thr Cys Asn Asp Ala Gly 85 90 95 AlaAsp Phe Val Phe Ala Gln Ala Lys Ser Ile His Val Ser Asp Val 100 105 110Ile Ala Gly Ile Asp Val Pro Asp Val Val Lys Glu Phe Phe Thr Tyr 115 120125 Asp Arg Ala Val Ser Tyr Glu Gly His Asn Arg Pro Ile Leu Ala Val 130135 140 Gln Val Thr Glu Leu Asn Asp Gly Val Phe Ile Gly Cys Ser Val Asn145 150 155 160 His Ala Val Thr Asp Gly Thr Ser Leu Trp Asn Phe Ile AsnThr Phe 165 170 175 Ala Glu Val Ser Arg Gly Ala Lys Asn Val Thr Arg GlnPro Asp Phe 180 185 190 Thr Arg Glu Ser Val Leu Ile Ser Pro Ala Val LeuLys Val Pro Gln 195 200 205 Gly Gly Pro Lys Val Thr Phe Asp Glu Asn AlaPro Leu Arg Glu Arg 210 215 220 Ile Phe Ser Phe Ser Arg Glu Ser Ile GlnGlu Leu Lys Ala Val Val 225 230 235 240 Asn Lys Lys Lys Trp Leu Thr ValAsp Asn Gly Glu Ile Asp Gly Val 245 250 255 Glu Leu Leu Gly Lys Gln SerAsn Asp Lys Leu Asn Gly Lys Glu Asn 260 265 270 Gly Ile Leu Thr Glu MetLeu Glu Ser Leu Phe Gly Arg Asn Asp Ala 275 280 285 Val Ser Lys Pro ValAla Val Glu Ile Ser Ser Phe Gln Ser Leu Cys 290 295 300 Ala Leu Leu TrpArg Ala Ile Thr Arg Ala Arg Lys Leu Pro Ser Ser 305 310 315 320 Lys ThrThr Thr Phe Arg Met Ala Val 
Asn Cys Arg His Arg Leu Ser 325 330 335 ProLys Leu Asn Pro Glu Tyr Phe Gly Asn Ala Ile Gln Ser Val Pro 340 345 350Thr Phe Ala Thr Ala Ala Glu Val Leu Ser Arg Asp Leu Lys Trp Cys 355 360365 Ala Asp Gln Leu Asn Gln Ser Val Ala Ala His Gln Asp Gly Arg Ile 370375 380 Arg Ser Val Val Ala Asp Trp Glu Ala Asn Pro Arg Cys Phe Pro Leu385 390 395 400 Gly Asn Ala Asp Gly Ala Ser Val Thr Met Gly Ser Ser ProArg Phe 405 410 415 Pro Met Tyr Asp Asn Asp Phe Gly Trp Gly Arg Pro ValAla Val Arg 420 425 430 Ser Gly Arg Ser Asn Lys Phe Asp Gly Lys Ile SerAla Phe Pro Gly 435 440 445 Arg Glu Gly Asn Gly Thr Val Asp Leu Glu ValVal Leu Ser Pro Glu 450 455 460 Thr Met Ala Gly Ile Glu Ser Asp Gly GluPhe Met Arg Tyr Val Thr 465 470 475 480 Asn Lys 64 461 PRT Arabidopsisthaliana 64 Met Ala Ser Cys Ile Gln Glu Leu His Phe Thr His Leu His IlePro 1 5 10 15 Val Thr Ile Asn Gln Gln Phe Leu Val His Pro Ser Ser ProThr Pro 20 25 30 Ala Asn Gln Ser Pro His His Ser Leu Tyr Leu Ser Asn LeuAsp Asp 35 40 45 Ile Ile Gly Ala Arg Val Phe Thr Pro Ser Val Tyr Phe TyrPro Ser 50 55 60 Thr Asn Asn Arg Glu Ser Phe Val Leu Lys Arg Leu Gln AspAla Leu 65 70 75 80 Ser Glu Val Leu Val Pro Tyr Tyr Pro Leu Ser Gly ArgLeu Arg Glu 85 90 95 Val Glu Asn Gly Lys Leu Glu Val Phe Phe Gly Glu GluGln Gly Val 100 105 110 Leu Met Val Ser Ala Asn Ser Ser Met Asp Leu AlaAsp Leu Gly Asp 115 120 125 Leu Thr Val Pro Asn Pro Ala Trp Leu Pro LeuIle Phe Arg Asn Pro 130 135 140 Gly Glu Glu Ala Tyr Lys Ile Leu Glu MetPro Leu Leu Ile Ala Gln 145 150 155 160 Val Thr Phe Phe Thr Cys Gly GlyPhe Ser Leu Gly Ile Arg Leu Cys 165 170 175 His Cys Ile Cys Asp Gly PheGly Ala Met Gln Phe Leu Gly Ser Trp 180 185 190 Ala Ala Thr Ala Lys ThrGly Lys Leu Ile Ala Asp Pro Glu Pro Val 195 200 205 Trp Asp Arg Glu ThrPhe Lys Pro Arg Asn Pro Pro Met Val Lys Tyr 210 215 220 Pro His His GluTyr Leu Pro Ile Glu Glu Arg Ser Asn Leu Thr Asn 225 230 235 240 Ser LeuTrp Asp Thr Lys Pro Leu Gln Lys Cys Tyr Arg Ile Ser Lys 245 250 255 GluPhe Gln Cys Arg Val Lys Ser Ile Ala 
Gln Gly Glu Asp Pro Thr 260 265 270Leu Val Cys Ser Thr Phe Asp Ala Met Ala Ala His Ile Trp Arg Ser 275 280285 Trp Val Lys Ala Leu Asp Val Lys Pro Leu Asp Tyr Asn Leu Arg Leu 290295 300 Thr Phe Ser Val Asn Val Arg Thr Arg Leu Glu Thr Leu Lys Leu Arg305 310 315 320 Lys Gly Phe Tyr Gly Asn Val Val Cys Leu Ala Cys Ala MetSer Ser 325 330 335 Val Glu Ser Leu Ile Asn Asp Ser Leu Ser Lys Thr ThrArg Leu Val 340 345 350 Gln Asp Ala Arg Leu Arg Val Ser Glu Asp Tyr LeuArg Ser Met Val 355 360 365 Asp Tyr Val Asp Val Lys Arg Pro Lys Arg LeuGlu Phe Gly Gly Lys 370 375 380 Leu Thr Ile Thr Gln Trp Thr Arg Phe GluMet Tyr Glu Thr Ala Asp 385 390 395 400 Phe Gly Trp Gly Lys Pro Val TyrAla Gly Pro Ile Asp Leu Arg Pro 405 410 415 Thr Pro Gln Val Cys Val LeuLeu Pro Gln Gly Gly Val Glu Ser Gly 420 425 430 Asn Asp Gln Ser Met ValVal Cys Leu Cys Leu Pro Pro Thr Ala Val 435 440 445 His Thr Phe Thr ArgLeu Leu Ser Leu Asn Asp His Lys 450 455 460 65 497 PRT Arabidopsisthaliana 65 Ala Trp Gln Ile Glu Gly Ile Gln Val Thr Val Ser Cys Phe PheVal 1 5 10 15 Thr Cys Gly Lys Thr Arg Ser Ser Ser Asn Asn Pro His HisThr Thr 20 25 30 Phe Phe Ile Leu Ser Glu Asn Asn Asn Gln Met Gly Glu AlaAla Glu 35 40 45 Gln Ala Arg Gly Phe His Val Thr Thr Thr Arg Lys Gln ValIle Thr 50 55 60 Ala Ala Leu Pro Leu Gln Asp His Trp Leu Pro Leu Ser AsnLeu Asp 65 70 75 80 Leu Leu Leu Pro Pro Leu Asn Val His Val Cys Phe CysTyr Lys Lys 85 90 95 Pro Leu His Phe Thr Asn Thr Val Ala Tyr Glu Thr LeuLys Thr Ala 100 105 110 Leu Ala Glu Thr Leu Val Ser Tyr Tyr Ala Phe AlaGly Glu Leu Val 115 120 125 Thr Asn Pro Thr Gly Glu Pro Glu Ile Leu CysAsn Asn Arg Gly Val 130 135 140 Asp Phe Val Glu Ala Gly Ala Asp Val GluLeu Arg Glu Leu Asn Leu 145 150 155 160 Tyr Asp Pro Asp Glu Ser Ile AlaLys Leu Val Pro Ile Lys Lys His 165 170 175 Gly Val Ile Ala Ile Gln ValThr Gln Leu Lys Cys Gly Ser Ile Val 180 185 190 Val Gly Cys Thr Phe AspHis Arg Val Ala Asp Ala Tyr Ser Met Asn 195 200 205 Met Phe Leu Leu SerTrp Ala Glu Ile Ser Arg Ser Asp Val Pro Ile 210 
215 220 Ser Cys Val ProSer Phe Arg Arg Ser Leu Leu Asn Pro Arg Arg Pro 225 230 235 240 Leu ValMet Asp Pro Ser Ile Asp Gln Ile Tyr Met Pro Val Thr Ser 245 250 255 LeuPro Pro Pro Gln Glu Thr Thr Asn Pro Glu Asn Leu Leu Ala Ser 260 265 270Arg Ile Tyr Tyr Ile Lys Ala Asn Ala Leu Gln Glu Leu Gln Thr Leu 275 280285 Ala Ser Ser Ser Lys Asn Gly Lys Arg Thr Lys Leu Glu Ser Phe Ser 290295 300 Ala Phe Leu Trp Lys Leu Val Ala Glu His Ala Ala Lys Asp Pro Val305 310 315 320 Pro Ile Lys Thr Ser Lys Leu Gly Ile Val Val Asp Gly ArgArg Arg 325 330 335 Leu Met Glu Lys Glu Asn Asn Thr Tyr Phe Gly Asn ValLeu Ser Val 340 345 350 Pro Phe Gly Gly Gln Arg Ile Asp Asp Leu Ile SerLys Pro Leu Ser 355 360 365 Trp Val Thr Glu Glu Val His Arg Phe Leu LysLys Ser Val Thr Lys 370 375 380 Glu His Phe Leu Asn Leu Ile Asp Trp ValGlu Thr Cys Arg Pro Thr 385 390 395 400 Pro Ala Val Ser Arg Ile Tyr SerVal Gly Ser Asp Asp Gly Pro Ala 405 410 415 Phe Val Val Ser Ser Gly ArgSer Phe Pro Val Asn Gln Val Asp Phe 420 425 430 Gly Trp Gly Ser Pro ValPhe Gly Ser Tyr His Phe Pro Trp Gly Gly 435 440 445 Ser Ala Gly Tyr ValMet Pro Met Pro Ser Ser Val Asp Asp Arg Asp 450 455 460 Trp Met Val TyrLeu His Leu Thr Lys Gly Gln Leu Arg Phe Ile Glu 465 470 475 480 Glu GluAla Ser His Val Leu Lys Pro Ile Asp Asn Asp Tyr Leu Lys 485 490 495 Ile66 433 PRT Clarkia breweri 66 Met Asn Val Thr Met His Ser Lys Lys LeuLeu Lys Pro Ser Ile Pro 1 5 10 15 Thr Pro Asn His Leu Gln Lys Leu AsnLeu Ser Leu Leu Asp Gln Ile 20 25 30 Gln Ile Pro Phe Tyr Val Gly Leu IlePhe His Tyr Glu Thr Leu Ser 35 40 45 Asp Asn Ser Asp Ile Thr Leu Ser LysLeu Glu Ser Ser Leu Ser Glu 50 55 60 Thr Leu Thr Leu Tyr Tyr His Val AlaGly Arg Tyr Asn Gly Thr Asp 65 70 75 80 Cys Val Ile Glu Cys Asn Asp GlnGly Ile Gly Tyr Val Glu Thr Ala 85 90 95 Phe Asp Val Glu Leu His Gln PheLeu Leu Gly Glu Glu Ser Asn Asn 100 105 110 Leu Asp Leu Leu Val Gly LeuSer Gly Phe Leu Ser Glu Thr Glu Thr 115 120 125 Pro Pro Leu Ala Ala IleGln Leu Asn Met Phe Lys Cys Gly Gly Leu 130 135 140 Val Ile 
Gly Ala GlnPhe Asn His Ile Ile Gly Asp Met Phe Thr Met 145 150 155 160 Ser Thr PheMet Asn Ser Trp Ala Lys Ala Cys Arg Val Gly Ile Lys 165 170 175 Glu ValAla His Pro Thr Phe Gly Leu Ala Pro Leu Met Pro Ser Ala 180 185 190 LysVal Leu Asn Ile Pro Pro Pro Pro Ser Phe Glu Gly Val Lys Phe 195 200 205Val Ser Lys Arg Phe Val Phe Asn Glu Asn Ala Ile Thr Arg Leu Arg 210 215220 Lys Glu Ala Thr Glu Glu Asp Gly Asp Gly Asp Asp Asp Gln Lys Lys 225230 235 240 Lys Arg Pro Ser Arg Val Asp Leu Val Thr Ala Phe Leu Ser LysSer 245 250 255 Leu Ile Glu Met Asp Cys Ala Lys Lys Glu Gln Thr Lys SerArg Pro 260 265 270 Ser Leu Met Val His Met Met Asn Leu Arg Lys Arg ThrLys Leu Ala 275 280 285 Leu Glu Asn Asp Val Ser Gly Asn Phe Phe Ile ValVal Asn Ala Glu 290 295 300 Ser Lys Ile Thr Val Ala Pro Lys Ile Thr AspLeu Thr Glu Ser Leu 305 310 315 320 Gly Ser Ala Cys Gly Glu Ile Ile SerGlu Val Ala Lys Val Asp Asp 325 330 335 Ala Glu Val Val Ser Ser Met ValLeu Asn Ser Val Arg Glu Phe Tyr 340 345 350 Tyr Glu Trp Gly Lys Gly GluLys Asn Val Phe Leu Tyr Thr Ser Trp 355 360 365 Cys Arg Phe Pro Leu TyrGlu Val Asp Phe Gly Trp Gly Ile Pro Ser 370 375 380 Leu Val Asp Thr ThrAla Val Pro Phe Gly Leu Ile Val Leu Met Asp 385 390 395 400 Glu Ala ProAla Gly Asp Gly Ile Ala Val Arg Ala Cys Leu Ser Glu 405 410 415 His AspMet Ile Gln Phe Gln Gln His His Gln Leu Leu Ser Tyr Val 420 425 430 Ser67 450 PRT Dianthus caryophyllus 67 Met Gly Ser Ser Tyr Gln Glu Ser ProPro Leu Leu Leu Glu Asp Leu 1 5 10 15 Lys Val Thr Ile Lys Glu Ser ThrLeu Ile Phe Pro Ser Glu Glu Thr 20 25 30 Ser Glu Arg Lys Ser Met Phe LeuSer Asn Val Asp Gln Ile Leu Asn 35 40 45 Phe Asp Val Gln Thr Val His PhePhe Arg Pro Asn Lys Glu Phe Pro 50 55 60 Pro Glu Met Val Ser Glu Lys LeuArg Lys Ala Leu Val Lys Leu Met 65 70 75 80 Asp Ala Tyr Glu Phe Leu AlaGly Arg Leu Arg Val Asp Pro Ser Ser 85 90 95 Gly Arg Leu Asp Val Asp CysAsn Gly Ala Gly Ala Gly Phe Val Thr 100 105 110 Ala Ala Ser Asp Tyr ThrLeu Glu Glu Leu Gly Asp Leu Val Tyr Pro 115 120 125 Asn Pro Ala Phe 
AlaGln Leu Val Thr Ser Gln Leu Gln Ser Leu Pro 130 135 140 Lys Asp Asp GlnPro Leu Phe Val Phe Gln Ile Thr Ser Phe Lys Cys 145 150 155 160 Gly GlyPhe Ala Met Gly Ile Ser Thr Asn His Thr Thr Phe Asp Gly 165 170 175 LeuSer Phe Lys Thr Phe Leu Glu Asn Leu Ala Ser Leu Leu His Glu 180 185 190Lys Pro Leu Ser Thr Pro Pro Cys Asn Asp Arg Thr Leu Leu Lys Ala 195 200205 Arg Asp Pro Pro Ser Val Ala Phe Pro His His Glu Leu Val Lys Phe 210215 220 Gln Asp Cys Glu Thr Thr Thr Val Phe Glu Ala Thr Ser Glu His Leu225 230 235 240 Asp Phe Lys Ile Phe Lys Leu Ser Ser Glu Gln Ile Lys LysLeu Lys 245 250 255 Glu Arg Ala Ser Glu Thr Ser Asn Gly Asn Val Arg ValThr Gly Phe 260 265 270 Asn Val Val Thr Ala Leu Val Trp Arg Cys Lys AlaLeu Ser Val Ala 275 280 285 Ala Glu Glu Gly Glu Glu Thr Asn Leu Glu ArgGlu Ser Thr Ile Leu 290 295 300 Tyr Ala Val Asp Ile Arg Gly Arg Leu AsnPro Glu Leu Pro Pro Ser 305 310 315 320 Tyr Thr Gly Asn Ala Val Leu ThrAla Tyr Ala Lys Glu Lys Cys Lys 325 330 335 Ala Leu Leu Glu Glu Pro PheGly Arg Ile Val Glu Met Val Gly Glu 340 345 350 Gly Ser Lys Arg Ile ThrAsp Glu Tyr Ala Arg Ser Ala Ile Asp Trp 355 360 365 Gly Glu Leu Tyr LysGly Phe Pro His Gly Glu Val Leu Val Ser Ser 370 375 380 Trp Trp Lys LeuGly Phe Ala Glu Val Glu Tyr Pro Trp Gly Lys Pro 385 390 395 400 Lys TyrSer Cys Pro Val Val Tyr His Arg Lys Asp Ile Val Leu Leu 405 410 415 PhePro Asp Ile Asp Gly Asp Ser Lys Gly Val Tyr Val Leu Ala Ala 420 425 430Leu Pro Ser Lys Glu Met Ser Lys Phe Gln His Trp Phe Glu Asp Thr 435 440445 Leu Cys 450 68 439 PRT Catharanthus roseus 68 Met Glu Ser Gly LysIle Ser Val Glu Thr Glu Thr Leu Ser Lys Thr 1 5 10 15 Leu Ile Lys ProSer Ser Pro Thr Pro Gln Ser Leu Ser Arg Tyr Asn 20 25 30 Leu Ser Tyr AsnAsp Gln Asn Ile Tyr Gln Thr Cys Val Ser Val Gly 35 40 45 Phe Phe Tyr GluAsn Pro Asp Gly Ile Glu Ile Ser Thr Ile Arg Glu 50 55 60 Gln Leu Gln AsnSer Leu Ser Lys Thr Leu Val Ser Tyr Tyr Pro Phe 65 70 75 80 Ala Gly LysVal Val Lys Asn Asp Tyr Ile His Cys Asn Asp Asp Gly 85 90 95 Ile Glu PheVal Glu 
Val Arg Ile Arg Cys Arg Met Asn Asp Ile Leu 100 105 110 Lys TyrGlu Leu Arg Ser Tyr Ala Arg Asp Leu Val Leu Pro Lys Arg 115 120 125 ValThr Val Gly Ser Glu Asp Thr Thr Ala Ile Val Gln Leu Ser His 130 135 140Phe Asp Cys Gly Gly Leu Ala Val Ala Phe Gly Ile Ser His Lys Val 145 150155 160 Ala Asp Gly Gly Thr Ile Ala Ser Phe Met Lys Asp Trp Ala Ala Ser165 170 175 Ala Cys Tyr Leu Ser Ser Ser His His Val Pro Thr Pro Leu LeuVal 180 185 190 Ser Asp Ser Ile Phe Pro Arg Gln Asp Asn Ile Ile Cys GluGln Phe 195 200 205 Pro Thr Ser Lys Asn Cys Val Glu Lys Thr Phe Ile PhePro Pro Glu 210 215 220 Ala Ile Glu Lys Leu Lys Ser Lys Ala Val Glu PheGly Ile Glu Lys 225 230 235 240 Pro Thr Arg Val Glu Val Leu Thr Ala PheLeu Ser Arg Cys Ala Thr 245 250 255 Val Ala Gly Lys Ser Ala Ala Lys AsnAsn Asn Cys Gly Gln Ser Leu 260 265 270 Pro Phe Pro Val Leu Gln Ala IleAsn Leu Arg Pro Ile Leu Glu Leu 275 280 285 Pro Gln Asn Ser Val Gly AsnLeu Val Ser Ile Tyr Phe Ser Arg Thr 290 295 300 Ile Lys Glu Asn Asp TyrLeu Asn Glu Lys Glu Tyr Thr Lys Leu Val 305 310 315 320 Ile Asn Glu LeuArg Lys Glu Lys Gln Lys Ile Lys Asn Leu Ser Arg 325 330 335 Glu Lys LeuThr Tyr Val Ala Gln Met Glu Glu Phe Val Lys Ser Leu 340 345 350 Lys GluPhe Asp Ile Ser Asn Phe Leu Asp Ile Asp Ala Tyr Leu Ser 355 360 365 AspSer Trp Cys Arg Phe Pro Phe Tyr Asp Val Asp Phe Gly Trp Gly 370 375 380Lys Pro Ile Trp Val Cys Leu Phe Gln Pro Tyr Ile Lys Asn Cys Val 385 390395 400 Val Met Met Asp Tyr Pro Phe Gly Asp Asp Tyr Gly Ile Glu Ala Ile405 410 415 Val Ser Phe Glu Gln Glu Lys Met Ser Ala Phe Glu Lys Asn GluGln 420 425 430 Leu Leu Gln Phe Val Ser Asn 435 69 451 PRT Arabidopsisthaliana 69 Met Ala Pro Ile Thr Phe Arg Lys Ser Tyr Thr Ile Val Pro AlaGlu 1 5 10 15 Pro Thr Trp Ser Gly Arg Phe Pro Leu Ala Glu Trp Asp GlnVal Gly 20 25 30 Thr Ile Thr His Ile Pro Thr Leu Tyr Phe Tyr Asp Lys ProSer Glu 35 40 45 Ser Phe Gln Gly Asn Val Val Glu Ile Leu Lys Thr Ser LeuSer Arg 50 55 60 Val Leu Val His Phe Tyr Pro Met Ala Gly Arg Leu Arg TrpLeu Pro 65 70 75 80 
Arg Gly Arg Phe Glu Leu Asn Cys Asn Ala Glu Gly ValGlu Phe Ile 85 90 95 Glu Ala Glu Ser Glu Gly Lys Leu Ser Asp Phe Lys AspPhe Ser Pro 100 105 110 Thr Pro Glu Phe Glu Asn Leu Met Pro Gln Val AsnTyr Lys Asn Pro 115 120 125 Ile Glu Thr Ile Pro Leu Phe Leu Ala Gln ValThr Lys Phe Lys Cys 130 135 140 Gly Gly Ile Ser Leu Ser Val Asn Val SerHis Ala Ile Val Asp Gly 145 150 155 160 Gln Ser Ala Leu His Leu Ile SerGlu Trp Gly Arg Leu Ala Arg Gly 165 170 175 Glu Pro Leu Glu Thr Val ProPhe Leu Asp Arg Lys Ile Leu Trp Ala 180 185 190 Gly Glu Pro Leu Pro ProPhe Val Ser Pro Pro Lys Phe Asp His Lys 195 200 205 Glu Phe Asp Gln ProPro Phe Leu Ile Gly Glu Thr Asp Asn Val Glu 210 215 220 Glu Arg Lys LysLys Thr Ile Val Val Met Leu Pro Leu Ser Thr Ser 225 230 235 240 Gln LeuGln Lys Leu Arg Ser Lys Ala Asn Gly Ser Lys His Ser Asp 245 250 255 ProAla Lys Gly Phe Thr Arg Tyr Glu Thr Val Thr Gly His Val Trp 260 265 270Arg Cys Ala Cys Lys Ala Arg Gly His Ser Pro Glu Gln Pro Thr Ala 275 280285 Leu Gly Ile Cys Ile Asp Thr Arg Ser Arg Met Glu Pro Pro Leu Pro 290295 300 Arg Gly Tyr Phe Gly Asn Ala Thr Leu Asp Val Val Ala Ala Ser Thr305 310 315 320 Ser Gly Glu Leu Ile Ser Asn Glu Leu Gly Phe Ala Ala SerLeu Ile 325 330 335 Ser Lys Ala Ile Lys Asn Val Thr Asn Glu Tyr Val MetIle Gly Ile 340 345 350 Glu Tyr Leu Lys Asn Gln Lys Asp Leu Lys Lys PheGln Asp Leu His 355 360 365 Ala Leu Gly Ser Thr Glu Gly Pro Phe Tyr GlyAsn Pro Asn Leu Gly 370 375 380 Val Val Ser Trp Leu Thr Leu Pro Met TyrGly Leu Asp Phe Gly Trp 385 390 395 400 Gly Lys Glu Phe Tyr Thr Gly ProGly Thr His Asp Phe Asp Gly Asp 405 410 415 Ser Leu Ile Leu Pro Asp GlnAsn Glu Asp Gly Ser Val Ile Leu Ala 420 425 430 Thr Cys Leu Gln Val AlaHis Met Glu Ala Phe Lys Lys His Phe Tyr 435 440 445 Glu Asp Ile 450 70461 PRT Arabidopsis thaliana 70 Met Ala Asn Gln Arg Lys Pro Ile Leu ProLeu Leu Leu Glu Lys Lys 1 5 10 15 Pro Val Glu Leu Val Lys Pro Ser LysHis Thr His Cys Glu Thr Leu 20 25 30 Ser Leu Ser Thr Leu Asp Asn Asp ProPhe Asn Glu Val Met Tyr Ala 35 
40 45 Thr Ile Tyr Val Phe Lys Ala Asn GlyLys Asn Leu Asp Asp Pro Val 50 55 60 Ser Leu Leu Arg Lys Ala Leu Ser GluLeu Leu Val His Tyr Tyr Pro 65 70 75 80 Leu Ser Gly Lys Leu Met Arg SerGlu Ser Asn Gly Lys Leu Gln Leu 85 90 95 Val Tyr Leu Gly Glu Gly Val ProPhe Glu Val Ala Thr Ser Thr Leu 100 105 110 Asp Leu Ser Ser Leu Asn TyrIle Glu Asn Leu Asp Asp Gln Val Ala 115 120 125 Leu Arg Leu Val Pro GluIle Glu Ile Asp Tyr Glu Ser Asn Val Cys 130 135 140 Tyr His Pro Leu AlaLeu Gln Val Thr Lys Phe Ala Cys Gly Gly Phe 145 150 155 160 Thr Ile GlyThr Ala Leu Thr His Ala Val Cys Asp Gly Tyr Gly Val 165 170 175 Ala GlnIle Ile His Ala Leu Thr Glu Leu Ala Ala Gly Lys Thr Glu 180 185 190 ProSer Val Lys Ser Val Trp Gln Arg Glu Arg Leu Val Gly Lys Ile 195 200 205Asp Asn Lys Pro Gly Lys Val Pro Gly Ser His Ile Asp Gly Phe Leu 210 215220 Ala Thr Ser Ala Tyr Leu Pro Thr Thr Asp Val Val Thr Glu Thr Ile 225230 235 240 Asn Ile Arg Ala Gly Asp Ile Lys Arg Leu Lys Asp Ser Met MetLys 245 250 255 Glu Cys Glu Tyr Leu Lys Glu Ser Phe Thr Thr Tyr Glu ValLeu Ser 260 265 270 Ser Tyr Ile Trp Lys Leu Arg Ser Arg Ala Leu Lys LeuAsn Pro Asp 275 280 285 Gly Ile Thr Val Leu Gly Val Ala Val Gly Ile ArgHis Val Leu Asp 290 295 300 Pro Pro Leu Pro Lys Gly Tyr Tyr Gly Asn AlaTyr Ile Asp Val Tyr 305 310 315 320 Val Glu Leu Thr Val Arg Glu Leu GluGlu Ser Ser Ile Ser Asn Ile 325 330 335 Ala Asn Arg Val Lys Lys Ala LysLys Thr Ala Tyr Glu Lys Gly Tyr 340 345 350 Ile Glu Glu Glu Leu Lys AsnThr Glu Arg Leu Met Arg Asp Asp Ser 355 360 365 Met Phe Glu Gly Val SerAsp Gly Leu Phe Phe Leu Thr Asp Trp Arg 370 375 380 Asn Ile Gly Trp PheGly Ser Met Asp Phe Gly Trp Asn Glu Pro Val 385 390 395 400 Asn Leu ArgPro Leu Thr Gln Arg Glu Ser Thr Val His Val Gly Met 405 410 415 Ile LeuLys Pro Ser Lys Ser Asp Pro Ser Met Glu Gly Gly Val Lys 420 425 430 ValIle Met Lys Leu Pro Arg Asp Ala Met Val Glu Phe Lys Arg Glu 435 440 445Met Ala Thr Met Lys Lys Leu Tyr Phe Gly Asp Thr Asn 450 455 460 71 460PRT Nicotiana tabacum 71 Met Asp Ser Lys 
Gln Ser Ser Glu Leu Val Phe ThrVal Arg Arg Gln 1 5 10 15 Lys Pro Glu Leu Ile Ala Pro Ala Lys Pro ThrPro Arg Glu Thr Lys 20 25 30 Phe Leu Ser Asp Ile Asp Asp Gln Glu Gly LeuArg Phe Gln Ile Pro 35 40 45 Val Ile Gln Phe Tyr His Lys Asp Ser Ser MetGly Arg Lys Asp Pro 50 55 60 Val Lys Val Ile Lys Lys Ala Ile Ala Glu ThrLeu Val Phe Tyr Tyr 65 70 75 80 Pro Phe Ala Gly Arg Leu Arg Glu Gly AsnGly Arg Lys Leu Met Val 85 90 95 Asp Cys Thr Gly Glu Gly Ile Met Phe ValGlu Ala Asp Ala Asp Val 100 105 110 Thr Leu Glu Gln Phe Gly Asp Glu LeuGln Pro Pro Phe Pro Cys Leu 115 120 125 Glu Glu Leu Leu Tyr Asp Val ProAsp Ser Ala Gly Val Leu Asn Cys 130 135 140 Pro Leu Leu Leu Ile Gln ValThr Arg Leu Arg Cys Gly Gly Phe Ile 145 150 155 160 Phe Ala Leu Arg LeuAsn His Thr Met Ser Asp Ala Pro Gly Leu Val 165 170 175 Gln Phe Met ThrAla Val Gly Glu Met Ala Arg Gly Gly Ser Ala Pro 180 185 190 Ser Ile LeuPro Val Trp Cys Arg Glu Leu Leu Asn Ala Arg Asn Pro 195 200 205 Pro GlnVal Thr Cys Thr His His Glu Tyr Asp Glu Val Arg Asp Thr 210 215 220 LysGly Thr Ile Ile Pro Leu Asp Asp Met Val His Lys Ser Phe Phe 225 230 235240 Phe Gly Pro Ser Glu Val Ser Ala Leu Arg Arg Phe Val Pro His His 245250 255 Leu Arg Lys Cys Ser Thr Phe Glu Leu Leu Thr Ala Val Leu Trp Arg260 265 270 Cys Arg Thr Met Ser Leu Lys Pro Asp Pro Glu Glu Glu Val ArgAla 275 280 285 Leu Cys Ile Val Asn Ala Arg Ser Arg Phe Asn Pro Pro LeuPro Thr 290 295 300 Gly Tyr Tyr Gly Asn Ala Phe Ala Phe Pro Val Ala ValThr Thr Ala 305 310 315 320 Ala Lys Leu Ser Lys Asn Pro Leu Gly Tyr AlaLeu Glu Leu Val Lys 325 330 335 Lys Thr Lys Ser Asp Val Thr Glu Glu TyrMet Lys Ser Val Ala Asp 340 345 350 Leu Met Val Leu Lys Gly Arg Pro HisPhe Thr Val Val Arg Thr Phe 355 360 365 Leu Val Ser Asp Val Thr Arg GlyGly Phe Gly Glu Val Asp Phe Gly 370 375 380 Trp Gly Lys Ala Val Tyr GlyGly Pro Ala Lys Gly Gly Val Gly Ala 385 390 395 400 Ile Pro Gly Val AlaSer Phe Tyr Ile Pro Phe Lys Asn Lys Lys Gly 405 410 415 Glu Asn Gly IleVal Val Pro Ile Cys Leu Pro Gly Phe Ala Met Glu 
420 425 430 Thr Phe ValLys Glu Leu Asp Gly Met Leu Lys Val Asp Ala Pro Leu 435 440 445 Val AsnSer Asn Tyr Ala Ile Ile Arg Pro Ala Leu 450 455 460 72 455 PRT Cucumismelo 72 Asp Phe Ser Phe His Val Arg Lys Cys Gln Pro Glu Leu Ile Ala Pro1 5 10 15 Ala Asn Pro Thr Pro Tyr Glu Phe Lys Gln Leu Ser Asp Val AspAsp 20 25 30 Gln Gln Ser Leu Arg Leu Gln Leu Pro Phe Val Asn Ile Tyr ProHis 35 40 45 Asn Pro Ser Leu Glu Gly Arg Asp Pro Val Lys Val Ile Lys GluAla 50 55 60 Ile Gly Lys Ala Leu Val Phe Tyr Tyr Pro Leu Ala Gly Arg LeuArg 65 70 75 80 Glu Gly Pro Gly Arg Lys Leu Phe Val Glu Cys Thr Gly GluGly Ile 85 90 95 Leu Phe Ile Glu Ala Asp Ala Asp Val Ser Leu Glu Glu PheTrp Asp 100 105 110 Thr Leu Pro Tyr Ser Leu Ser Ser Met Gln Asn Asn IleIle His Asn 115 120 125 Ala Leu Asn Ser Asp Glu Val Leu Asn Ser Pro LeuLeu Leu Ile Gln 130 135 140 Val Thr Arg Leu Lys Cys Gly Gly Phe Ile PheGly Leu Cys Phe Asn 145 150 155 160 His Thr Met Ala Asp Gly Phe Gly IleVal Gln Phe Met Lys Ala Thr 165 170 175 Ala Glu Ile Ala Arg Gly Ala PheAla Pro Ser Ile Leu Pro Val Trp 180 185 190 Gln Arg Ala Leu Leu Thr AlaArg Asp Pro Pro Arg Ile Thr Phe Arg 195 200 205 His Tyr Glu Tyr Asp GlnVal Val Asp Met Lys Ser Gly Leu Ile Pro 210 215 220 Val Asn Ser Lys IleAsp Gln Leu Phe Phe Phe Ser Gln Leu Gln Ile 225 230 235 240 Ser Thr LeuArg Gln Thr Leu Pro Ala His Leu His Asp Cys Pro Ser 245 250 255 Phe GluVal Leu Thr Ala Tyr Val Trp Arg Leu Arg Thr Ile Ala Leu 260 265 270 GlnPhe Lys Pro Glu Glu Glu Val Arg Phe Leu Cys Val Met Asn Leu 275 280 285Arg Ser Lys Ile Asp Ile Pro Leu Gly Tyr Tyr Gly Asn Ala Val Val 290 295300 Val Pro Ala Val Ile Thr Thr Ala Ala Lys Leu Cys Gly Asn Pro Leu 305310 315 320 Gly Tyr Ala Val Asp Leu Ile Arg Lys Ala Lys Ala Lys Ala ThrMet 325 330 335 Glu Tyr Ile Lys Ser Thr Val Asp Leu Met Val Ile Lys GlyArg Pro 340 345 350 Tyr Phe Thr Val Val Gly Ser Phe Met Met Ser Asp LeuThr Arg Ile 355 360 365 Gly Val Glu Asn Val Asp Phe Gly Trp Gly Lys AlaIle Phe Gly Gly 370 375 380 Pro Thr Thr Thr Gly Ala Arg 
Ile Thr Arg GlyLeu Val Ser Phe Cys 385 390 395 400 Val Pro Phe Met Asn Arg Asn Gly GluLys Gly Thr Ala Leu Ser Leu 405 410 415 Cys Leu Pro Pro Pro Ala Met GluArg Phe Arg Ala Asn Val His Ala 420 425 430 Ser Leu Gln Val Lys Gln ValVal Asp Ala Val Asp Ser His Met Gln 435 440 445 Thr Ile Gln Ser Ala SerLys 450 455 73 445 PRT Arabidopsis thaliana 73 Met Ser Ile Gln Ile LysGln Ser Thr Met Val Arg Pro Ala Glu Glu 1 5 10 15 Thr Pro Asn Lys SerLeu Trp Leu Ser Asn Ile Asp Met Ile Leu Arg 20 25 30 Thr Pro Tyr Ser HisThr Gly Ala Val Leu Ile Tyr Lys Gln Pro Asp 35 40 45 Asn Asn Glu Asp AsnIle His Pro Ser Ser Ser Met Tyr Phe Asp Ala 50 55 60 Asn Ile Leu Ile GluAla Leu Ser Lys Ala Leu Val Pro Phe Tyr Pro 65 70 75 80 Met Ala Gly ArgLeu Lys Ile Asn Gly Asp Arg Tyr Glu Ile Asp Cys 85 90 95 Asn Ala Glu GlyAla Leu Phe Val Glu Ala Glu Ser Ser His Val Leu 100 105 110 Glu Asp PheGly Asp Phe Arg Pro Asn Asp Glu Leu His Arg Val Met 115 120 125 Val ProThr Cys Asp Tyr Ser Lys Gly Ile Ser Ser Phe Pro Leu Leu 130 135 140 MetVal Gln Leu Thr Arg Phe Arg Cys Gly Gly Val Ser Ile Gly Phe 145 150 155160 Ala Gln His His His Val Cys Asp Gly Met Ala His Phe Glu Phe Asn 165170 175 Asn Ser Trp Ala Arg Ile Ala Lys Gly Leu Leu Pro Ala Leu Glu Pro180 185 190 Val His Asp Arg Tyr Leu His Leu Arg Pro Arg Asn Pro Pro GlnIle 195 200 205 Lys Tyr Ser His Ser Gln Phe Glu Pro Phe Val Pro Ser LeuPro Asn 210 215 220 Glu Leu Leu Asp Gly Lys Thr Asn Lys Ser Gln Thr LeuPhe Ile Leu 225 230 235 240 Ser Arg Glu Gln Ile Asn Thr Leu Lys Gln LysLeu Asp Leu Ser Asn 245 250 255 Asn Thr Thr Arg Leu Ser Thr Tyr Glu ValVal Ala Ala His Val Trp 260 265 270 Arg Ser Val Ser Lys Ala Arg Gly LeuSer Asp His Glu Glu Ile Lys 275 280 285 Leu Ile Met Pro Val Asp Gly ArgSer Arg Ile Asn Asn Pro Ser Leu 290 295 300 Pro Lys Gly Tyr Cys Gly AsnVal Val Phe Leu Ala Val Cys Thr Ala 305 310 315 320 Thr Val Gly Asp LeuSer Cys Asn Pro Leu Thr Asp Thr Ala Gly Lys 325 330 335 Val Gln Glu AlaLeu Lys Gly Leu Asp Asp Asp Tyr Leu Arg Ser Ala 340 345 350 Ile 
Asp HisThr Glu Ser Lys Pro Gly Leu Pro Val Pro Tyr Met Gly 355 360 365 Ser ProGlu Lys Thr Leu Tyr Pro Asn Val Leu Val Asn Ser Trp Gly 370 375 380 ArgIle Pro Tyr Gln Ala Met Asp Phe Gly Trp Gly Ser Pro Thr Phe 385 390 395400 Phe Gly Ile Ser Asn Ile Phe Tyr Asp Gly Gln Cys Phe Leu Ile Pro 405410 415 Ser Arg Asp Gly Asp Gly Ser Met Thr Leu Ala Ile Asn Leu Phe Ser420 425 430 Ser His Leu Ser Arg Phe Lys Lys Tyr Phe Tyr Asp Phe 435 440445 74 446 PRT Arabidopsis thaliana 74 Met Glu Thr Met Thr Met Lys ValGlu Thr Ile Ser Lys Glu Ile Ile 1 5 10 15 Lys Pro Ser Ser Pro Thr ProAsn Asn Leu Gln Thr Leu Gln Leu Ser 20 25 30 Ile Tyr Asp His Ile Leu ProPro Val Tyr Thr Val Ala Phe Leu Phe 35 40 45 Tyr Thr Lys Asn Asp Leu IleSer Gln Glu His Thr Ser His Lys Leu 50 55 60 Lys Thr Ser Leu Ser Glu ThrLeu Thr Lys Phe Tyr Pro Leu Ala Gly 65 70 75 80 Arg Ile Thr Gly Val ThrVal Asp Cys Thr Asp Glu Gly Ala Ile Phe 85 90 95 Val Asp Ala Arg Val AsnAsn Cys Pro Leu Thr Glu Phe Leu Lys Cys 100 105 110 Pro Asp Phe Asp AlaLeu Gln Gln Leu Leu Pro Leu Asp Val Val Asp 115 120 125 Asn Pro Tyr ValAla Ala Ala Thr Trp Pro Leu Leu Leu Val Lys Ala 130 135 140 Thr Tyr PheGly Cys Gly Gly Met Ala Ile Gly Ile Cys Ile Thr His 145 150 155 160 LysIle Ala Asp Ala Ala Ser Ile Ser Thr Phe Ile Arg Ser Trp Ala 165 170 175Ala Thr Ala Arg Gly Glu Asn Asp Ala Ala Ala Met Glu Ser Pro Val 180 185190 Phe Ala Gly Ala Asn Phe Tyr Pro Pro Ala Asn Glu Ala Phe Lys Leu 195200 205 Pro Ala Asp Glu Gln Ala Gly Lys Arg Ser Ser Ile Thr Lys Arg Phe210 215 220 Val Phe Glu Ala Ser Lys Val Glu Asp Leu Arg Thr Lys Ala AlaSer 225 230 235 240 Glu Glu Thr Val Asp Gln Pro Thr Arg Val Glu Ser ValThr Ala Leu 245 250 255 Ile Trp Lys Cys Phe Val Ala Ser Ser Lys Thr ThrThr Cys Asp His 260 265 270 Lys Val Leu Val Gln Leu Ala Asn Leu Arg SerLys Ile Pro Ser Leu 275 280 285 Leu Gln Glu Ser Ser Ile Gly Asn Leu MetPhe Ser Ser Val Val Leu 290 295 300 Ser Ile Gly Arg Gly Gly Glu Val LysIle Glu Glu Ala Val Arg Asp 305 310 315 320 Leu Arg Lys Lys Lys Glu GluLeu 
Gly Thr Val Ile Leu Asp Glu Gly 325 330 335 Gly Ser Ser Asp Ser SerSer Met Ile Gly Ser Lys Leu Ala Asn Leu 340 345 350 Met Leu Thr Asn TyrSer Arg Leu Ser Tyr Glu Thr His Glu Pro Tyr 355 360 365 Thr Val Ser SerTrp Cys Lys Leu Pro Leu Tyr Glu Ala Ser Phe Gly 370 375 380 Trp Asp SerPro Val Trp Val Val Gly Asn Val Ser Pro Val Leu Gly 385 390 395 400 AsnLeu Ala Met Leu Ile Asp Ser Lys Asp Gly Gln Gly Ile Glu Ala 405 410 415Phe Val Thr Leu Pro Glu Glu Asn Met Ser Ser Phe Glu Gln Asn Pro 420 425430 Glu Leu Leu Ala Phe Ala Thr Met Asn Pro Ser Val Leu Val 435 440 445

We claim:
 1. A purified protein, comprising an amino acid sequence selected from the group consisting of: SEQ ID NOs: 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 45, 50, 52, 54, 56, and 58.
 2. A specific binding agent that binds the protein of claim 1.
 3. An isolated nucleic acid molecule encoding a protein according to claim 1.
 4. An isolated nucleic acid molecule according to claim 3, further comprising a sequence selected from the group consisting of: SEQ ID NOs: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 44, 49, 51, 53, 55, and 57.
 5. A recombinant nucleic acid molecule, comprising a promoter sequence operably linked to a nucleic acid molecule according to claim 3.
 6. A cell transformed with a recombinant nucleic acid molecule according to claim 5.
 7. A transgenic organism, comprising a recombinant nucleic acid molecule according to claim 5, wherein the transgenic organism is selected from the group consisting of plants, bacteria, insects, fungi, and mammals.
 8. An isolated nucleic acid molecule that: (a) hybridizes under low-stringency conditions with a nucleic acid probe, the probe comprising a sequence selected from the group consisting of SEQ ID NOs: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 44, 49, 51, 53, 55, and 57 and fragments thereof; and (b) encodes a protein having transacylase activity.
 9. A transacylase encoded by the nucleic acid molecule of claim 8.
 10. A recombinant nucleic acid molecule, comprising a promoter sequence operably linked to a nucleic acid molecule according to claim 8.
 11. A cell transformed with a recombinant nucleic acid molecule according to claim 10.
 12. A transgenic organism, comprising a recombinant nucleic acid molecule according to claim 10, wherein the transgenic organism is selected from the group consisting of plants, bacteria, insects, fungi, and mammals.
 13. A specific binding agent, that binds to the transacylase of claim 9.
 14. An isolated nucleic acid molecule that: (a) has at least 60% sequence identity with a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 44, 49, 51, 53, 55, and 57; and (b) encodes a protein having transacylase activity.
 15. A method of identifying a nucleic acid sequence, comprising: (a) hybridizing the nucleic acid sequence to at least 10 contiguous nucleotides of a sequence selected from the group consisting of SEQ ID NOs: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 44, 49, 51, 53, 55, and 57; and (b) identifying the nucleic acid sequence as one that encodes a transacylase.
 16. A nucleic acid molecule identified by the method of claim 15.
 17. The method of claim 15, wherein hybridizing the nucleic acid sequence is performed under low-stringency conditions.
 18. A transacylase encoded by the nucleic acid molecule of claim 16.
 19. A specific binding agent, that binds the transacylase of claim 18.
 20. The method of claim 15, wherein step (a) occurs in a PCR reaction.
 21. The method of claim 15, wherein step (a) occurs during library screening.
 22. The method of claim 15, wherein the isolated nucleic acid sequence is isolated from the genus Taxus.
 23. A purified protein having transacylase activity, comprising an amino acid sequence selected from the group consisting of: (a) an amino acid sequence selected from the group consisting of SEQ ID NOs: 26, 28, 50, 52, 54, 56, and 58; (b) an amino acid sequence that differs from the amino acid sequence specified in (a) by one or more conservative amino acid substitutions; and (c) an amino acid sequence having at least 60% sequence identity to the sequences specified in (a) or (b).
 24. An isolated nucleic acid molecule encoding a protein according to claim 23.
 25. An isolated nucleic acid molecule according to claim 24, further comprising a sequence selected from the group consisting of SEQ ID NOs: 25, 27, 49, 51, 53, 55, and 57.
 26. A recombinant nucleic acid molecule, comprising a promoter sequence operably linked to the nucleic acid molecule of claim 24.
 27. A cell transformed with a recombinant nucleic acid molecule according to claim 26.
 28. A method for synthesizing a second intermediate in the paclitaxel biosynthetic pathway, comprising: contacting a first intermediate with at least one purified transacylase of claim 18; and allowing the transacylase to transfer an acyl group to the first intermediate, wherein transfer of the acyl group yields the second intermediate in the paclitaxel biosynthetic pathway.
 29. The method of claim 28, wherein the transacylase is expressed in a transgenic organism and the synthesis of the second intermediate occurs in vivo.
 30. A method of transferring an acyl group to a taxoid, comprising: contacting a taxoid with at least one transacylase of claim 18; and allowing the transacylase to transfer an acyl group to the taxoid.
 31. The method of claim 30, wherein the transacylase is expressed in a transgenic organism and the synthesis of the taxoid occurs in vivo.
 32. The method of claim 30, wherein the taxoid is paclitaxel.
 33. The method of claim 30, wherein the taxoid is baccatin III.
 34. The method of claim 30, wherein the taxoid is 10-deacetyl-baccatin III.